CENTER  FOR  STOCHASTIC  PROCESSES 


Department  of  Statistics 
University  of  North  Carolina 
Chapel  Hill,  North  Carolina 



TAIL BEHAVIOUR FOR THE SUPREMA OF GAUSSIAN
PROCESSES WITH A VIEW TOWARDS EMPIRICAL PROCESSES


Robert  J.  Adler 


Gennady Samorodnitsky


Distribution unlimited.


Technical  Report  No.  127 


November  1985 



UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE

1a. REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3. DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release;
   distribution unlimited.
4. PERFORMING ORGANIZATION REPORT NUMBER(S): Technical Report No. 127
5. MONITORING ORGANIZATION REPORT NUMBER(S): AFOSR-TR-86-0015
6a. NAME OF PERFORMING ORGANIZATION: Center for Stochastic Processes
6b. OFFICE SYMBOL (if applicable):
6c. ADDRESS (City, State and ZIP Code): Statistics Dept., Univ. of North
    Carolina, Phillips Hall 039-A, Chapel Hill, NC 27514
7a. NAME OF MONITORING ORGANIZATION: Air Force Office of Scientific Research
7b. ADDRESS (City, State and ZIP Code): Bolling Air Force Base,
    Washington, DC 20332
8a. NAME OF FUNDING/SPONSORING ORGANIZATION: AFOSR
8c. ADDRESS (City, State and ZIP Code): Bolling Air Force Base,
    Washington, DC 20332
9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER: F49620-85-C-0144
10. SOURCE OF FUNDING NOS. (PROGRAM ELEMENT NO.):
11. TITLE (Include Security Classification): "Tail Behaviour for the Suprema
    of Gaussian Processes with a View Towards Empirical Processes"
12. PERSONAL AUTHOR(S): R. J. Adler and G. Samorodnitsky
13a. TYPE OF REPORT: technical
13b. TIME COVERED: from 9/85 to 8/86
14. DATE OF REPORT (Yr., Mo., Day): Nov. 85
15. PAGE COUNT: 50
16. SUPPLEMENTARY NOTATION:
17. COSATI CODES:
18. SUBJECT TERMS (Continue on reverse if necessary and identify by block
    number): Gaussian processes, isonormal process, supremum, metric entropy,
    Brownian sheet, empirical processes.

19. ABSTRACT (Continue on reverse if necessary and identify by block number)

Initially we consider "the" standard isonormal linear process $L$ on a Hilbert
space $H$, and applying metric entropy methods obtain bounds for the
probability that $\sup_C Lx > \lambda$, $C \subset H$ and $\lambda$ large.
Under the assumption that the entropy function of $C$ grows polynomially, we
find bounds of the form $c\lambda^{\alpha} e^{-\lambda^2/(2\sigma^2)}$, where
$\sigma^2$ is the maximal variance of $L$. We use a notion of entropy finer
than that usually employed, and specifically suited to the non-stationary
situation. As a result we obtain, in the non-stationary setting, more precise
bounds than any in the literature.

We then treat a number of examples in which the power $\alpha$ is identified.
These include the distribution of the maximum of certain "locally stationary"
processes on $\mathbb{R}^k$, as well as those of the rectangle indexed, pinned
Brownian sheet on $\mathbb{R}^k$, for which $\alpha = 2(2k-1)$, and the
half-plane indexed pinned sheet on $\mathbb{R}^2$, for which $\alpha = 2$.

20. DISTRIBUTION/AVAILABILITY OF ABSTRACT: UNCLASSIFIED/UNLIMITED [x]
    SAME AS RPT. [ ]  DTIC USERS [ ]
21. ABSTRACT SECURITY CLASSIFICATION: UNCLASSIFIED
22a. NAME OF RESPONSIBLE INDIVIDUAL:
22c. OFFICE SYMBOL:

TAIL BEHAVIOUR FOR THE SUPREMA OF GAUSSIAN
PROCESSES WITH A VIEW TOWARDS EMPIRICAL PROCESSES


Robert J. Adler (1)

Faculty of Industrial Engineering & Management
Technion-Israel Institute of Technology
and
Center for Stochastic Processes
Statistics Department
University of North Carolina at Chapel Hill

and

Gennady Samorodnitsky (2)

Faculty of Industrial Engineering & Management
Technion-Israel Institute of Technology


(1) Research supported in part by AFOSR Contract Nos. 84-0104, 85-0384 and
F49620 85 C 0114 while visiting the Center for Stochastic Processes, Chapel
Hill, North Carolina.

(2) Research supported in part by the Wolf Foundation.

SUMMARY 


Initially we consider "the" standard isonormal linear process $L$ on a
Hilbert space $H$, and applying metric entropy methods obtain bounds for the
probability that $\sup_C Lx > \lambda$, $C \subset H$ and $\lambda$ large.
Under the assumption that the entropy function of $C$ grows polynomially, we
find bounds of the form $c\lambda^{\alpha} e^{-\lambda^2/(2\sigma^2)}$, where
$\sigma^2$ is the maximal variance of $L$. We use a notion of entropy finer
than that usually employed, and specifically suited to the non-stationary
situation. As a result we obtain, in the non-stationary setting, more precise
bounds than any in the literature.

We then treat a number of examples in which the power $\alpha$ is identified.
These include the distribution of the maximum of certain "locally stationary"
processes on $\mathbb{R}^k$, as well as those of the rectangle indexed, pinned
Brownian sheet on $\mathbb{R}^k$, for which $\alpha = 2(2k-1)$, and the
half-plane indexed pinned sheet on $\mathbb{R}^2$, for which $\alpha = 2$.

Running head: Suprema of Gaussian Processes


1.  INTRODUCTION 


We start with some motivation from the theory of empirical processes,
letting $X_1,\dots,X_n$ be i.i.d. observations from some k-dimensional
distribution, and assuming we want to test the hypothesis that the parent
distribution is given by a measure $\nu$: $\nu(A) = P\{X_i \in A\}$ on the unit
cube. A natural test procedure is to form the empirical measure $\nu_n$:
$\nu_n(A) = \frac{1}{n}\sum_{i=1}^{n} I_A(X_i)$ ($I_A$ is the indicator function
of $A$) and compare $\nu_n$ to $\nu$ via a Kolmogorov-Smirnov type statistic of
the form

(1.1)  $\sup_{A \in \mathcal{A}} |\nu_n(A) - \nu(A)|$

for some family $\mathcal{A}$ of Borel subsets of $[0,1]^k$. It is known
(Dudley 1978, 1984) that $\sqrt{n}(\nu_n - \nu)$ converges weakly to a Gaussian
process on $\mathcal{A}$, under conditions related to the size of
$\mathcal{A}$. Consequently, the study of (1.1) reduces, in the limit, to the
study of the supremum of a particular Gaussian process over a class of sets.
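
As a rough illustration of (1.1), the following Python sketch approximates the statistic when the family of sets is taken to be the lower-left rectangles $[0,s]\times[0,t]$ in the unit square and $\nu$ is Lebesgue measure. The choice of family, the finite grid, the sample size and the function names are all assumptions made for the example; a maximum over a grid only bounds the true supremum from below.

```python
import random

def empirical_measure(points, s, t):
    """nu_n of the rectangle [0, s] x [0, t]: fraction of sample points inside it."""
    return sum(1 for (x, y) in points if x <= s and y <= t) / len(points)

def ks_statistic(points, grid=20):
    """Grid approximation to sup_A |nu_n(A) - nu(A)| over rectangles
    A = [0, s] x [0, t], with nu = Lebesgue measure on the unit square."""
    best = 0.0
    for i in range(1, grid + 1):
        for j in range(1, grid + 1):
            s, t = i / grid, j / grid
            best = max(best, abs(empirical_measure(points, s, t) - s * t))
    return best

random.seed(0)
n = 500
points = [(random.random(), random.random()) for _ in range(n)]
stat = ks_statistic(points)
# sqrt(n) * stat is the normalisation under which the Gaussian limit appears.
print(stat, n ** 0.5 * stat)
```

Critical levels for such a statistic would still have to come from simulation or from tail bounds of the kind developed in the paper; the sketch only shows how the statistic itself is formed.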

Unlike the case for their Markov counterparts, however, it is well known
that for Gaussian processes it borders on the impossible to obtain the exact
distribution of their (global) maxima. For stationary Gaussian processes on
the line, for example, there are only six covariance functions for which the
precise distribution of the maxima of the corresponding processes is known
(c.f. Slepian (1961), Slepian and Shepp (1976), Cressie and Davis (1981),
Darling (1983)). For random fields on $\mathbb{R}^k$ the situation is even
worse, for there exists no non-trivial Gaussian field, either stationary or
not, for which the precise distribution of the maxima is known. In certain
specific cases, however, upper and lower bounds to this distribution are known.

Goodman (1976), for example, calculated good bounds for the cases of the
pinned and regular Brownian sheets in $\mathbb{R}^2$. (See Section 4 for
definitions.)


These  have  been  improved  and  extended  to  higher  dimensions  in  Cabana  and 
Wschebor  (1982),  Cabana  (1984)  and  Adler  and  Brown  (1986).  All  but  the  last 
reference  deal  only  with  sheets  arising  from  the  case  v  =  Lebesgue  measure 
in  (1.1).  The  only  other  Gaussian  field  for  which  some  (not  wholly  satisfac¬ 
tory)  bounds  are  known  is  a  two-parameter  generalisation  of  Slepian's 
triangular  covariance  function  (Cabana  and  Wschebor  (1981),  Adler  (1984)). 

Needless  to  say,  in  more  general  situations,  such  as  those  arising  from 
(1.1)  when  the  parameter  space  may  be  a  class  of  sets,  virtually  nothing  is 
known  on  the  exact  distribution  of  the  supremum. 

Partly, or perhaps primarily, because of this dearth of results a large
amount of effort has been expended in studying the asymptotic properties of
Gaussian maxima. The most central, and best known, result in this direction
is due to four authors, Fernique (1970, 1975), Landau and Shepp (1971) and
Marcus and Shepp (1971), who proved various versions of the result that for
any zero mean, sample path continuous Gaussian process $X(t)$, $t \in S$, with
$S$ a metric space,

(1.2)  $\lim_{\lambda\to\infty} \lambda^{-2} \log P\{\sup_{t\in S} X(t) > \lambda\} = -1/(2\sigma^2)$,

where

$\sigma^2 = \sup_{t\in S} E\{X^2(t)\}$.

An immediate consequence of (1.2) is that for all $\lambda_0 > 0$, and any
$\epsilon > 0$, there exists a constant $K = K(\epsilon,\lambda_0)$ such that
if $\lambda > \lambda_0$ then

(1.3)  $P\{\sup_{t\in S} X(t) > \lambda\} \le K \exp(\epsilon\lambda^2 - \lambda^2/(2\sigma^2))$.

(An even sharper result than this is due to Borell (1975). See comment 3 of
Section 6.)

Our aim in this paper will be to perform a simple epsilonectomy, i.e. to
remove the factor $\exp(\epsilon\lambda^2)$ from (1.3). In general this cannot
be done without paying some price, and in the cases we shall consider the
price will be to replace this exponential factor by a smaller power factor of
the form $K\lambda^{\alpha}$, so as to obtain bounds of the form

(1.4)  $P\{\sup_{t\in S} X(t) > \lambda\} \le K\lambda^{\alpha} e^{-\lambda^2/(2\sigma^2)}$

for large enough $\lambda$.
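
To make the difference between (1.2) and (1.4) concrete, here is a small Python check on one process whose supremum distribution is classical: standard Brownian motion on $[0,1]$, for which the reflection principle gives $P\{\sup_{[0,1]} W_t > \lambda\} = 2P\{W_1 > \lambda\}$ and the maximal variance is 1. The choice of this example and the function names are ours, not the paper's; the sketch only shows the logarithmic rate drifting slowly towards $-1/2$, which is the information a power factor as in (1.4) makes explicit.

```python
import math

def bm_sup_tail(lam):
    """P{sup_{[0,1]} W_t > lam} = 2 * P{W_1 > lam}, by the reflection principle."""
    return math.erfc(lam / math.sqrt(2.0))

def log_rate(lam):
    """lam^{-2} * log P{sup W > lam}; (1.2) says this tends to -1/(2 sigma^2) = -1/2."""
    return math.log(bm_sup_tail(lam)) / lam ** 2

def power_ratio(lam):
    """Tail divided by lam^{-1} * exp(-lam^2 / 2); stays bounded as lam grows."""
    return bm_sup_tail(lam) * lam * math.exp(lam ** 2 / 2.0)

for lam in (2.0, 4.0, 8.0, 16.0):
    print(lam, log_rate(lam), power_ratio(lam))
```

The bounded ratio in the last column is the exact analogue, for this one process, of a bound of the form (1.4) with power $-1$.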

Results like (1.4) are not new. They were obtained originally by
Pickands (1969a,b) for the class of zero mean, stationary Gaussian processes on
$[0,1]$ whose covariance function $R(t) = E\{X(s)X(s+t)\}$ satisfies

(1.5)  $R(t) = 1 - c|t|^{\alpha} + o(|t|^{\alpha})$ as $|t| \to 0$,

where $\alpha \in (0,2]$ and $c > 0$ are constants. Pickands showed that for
each fixed $h > 0$ for which $\sup_{\epsilon \le t \le h} R(t) = \delta_\epsilon < 1$
for all $\epsilon > 0$,

(1.6)  $\lim_{\lambda\to\infty} \dfrac{P\{\sup_{0\le t\le h} X_t > \lambda\}}{\lambda^{2/\alpha}\phi(\lambda)/\lambda} = h\,c^{1/\alpha} H_\alpha$,

where $H_\alpha > 0$ is a finite constant depending only on $\alpha$ and
$\phi$ is a standard normal density function. (Except for the cases
$\alpha = 1$, $\alpha = 2$, the value of $H_\alpha$ is not known.) This result
has been extended to certain stationary random fields by Belyaev and
Piterbarg (1972) and, more recently, to certain non-homogeneous processes on
$\mathbb{R}^1$ by Piterbarg and Prisjaznjuk (1979). A proof of (1.6), along
with historical details, can be found in Leadbetter, Lindgren and Rootzen
(1983).

More recently Weber (1978, 1980) has obtained a set of results which,
while they do not identify constants as in (1.6), provide bounds to the
distributions of Gaussian suprema for the widest possible class of Gaussian
processes, including the set-indexed processes described above. However, as we
shall show later, his bounds, when they are of the form of (1.4), do not always
yield the smallest possible value of $\alpha$. We shall have more specific
comments to make about Weber's results later.

Before saying any more, it is probably worthwhile at this point to explain
to the sceptic what we gain from an epsilonectomy at (1.3) beyond the surgeon's
natural pleasure of neatly removing an unnecessary appendage or, indeed, from
sharpening the power in Weber's results. The first application is purely
theoretical. Consider a function valued Gaussian process, i.e. a process
$Y_t$ whose value at a given time is a Gaussian random process. Such processes
arise naturally in a number of ways, often by "relabelling", for example, a
two-parameter process $X(s,t)$ to obtain a function valued process under the
correspondence $Y_t(s) = X(s,t)$. Such processes include the Kiefer process
(Kiefer (1972)) of empirical process theory. Iterated logarithm type results
for the growth of $\sup_s Y_t(s)$ with $t$ have been studied in depth (see, for
example, Goodman, Kuelbs and Zinn (1981)) and, to a heavy extent, are based on
the inequality (1.3). Finer results, such as upper-lower class theorems for
$\sup_s Y_t(s)$, are much harder to obtain (Kuelbs (1975) is one exception we
are aware of) as (1.3) does not provide fine enough information. A result of
the form (1.4) does, however, fulfill this need, and is applied to this purpose
to obtain upper-lower class theorems for empirical processes in Adler and
Brown (1986). Establishing (1.4) in general, therefore, opens up the
possibility of a general upper-lower class theory for function valued
processes.

For  the  second  application  we  return  to  our  opening  paragraph  and  the 
Kolmogorov-Smirnov  type  statistic  (1.1).  Although  our  results  will  bound  the 
(asymptotic  in  n)  tail  distribution  of  (1.1),  they  will  not  really  do  so 


sharply  enough  to  enable,  say,  the  generation  of  critical  levels  for  statistical 
tests.  This  problem  seems  to  be  hard  enough  that  for  the  foreseeable  future 
this will be done by simulation techniques. What a bound like (1.4) tells the
simulator, however, is that the critical levels depend on three parameters,
$K$, $\alpha$, and $\sigma^2$. As will be shown in Section 4, $\alpha$ and
$\sigma^2$ can be obtained from our general theory, so that only one parameter
remains to be estimated, making the simulation task much simpler.

The  paper  is  organized  as  follows.  In  order  to  treat  the  most  general 
processes  possible,  we  shall  work  initially  with  the  isonormal  Gaussian  process 
on  Hilbert  space.  This,  together  with  requisite  entropy  notions,  will  be 
described  in  the  following  section,  where  we  shall  also  develop  a  version  of 
Fernique's  (1975)  inequality,  that  will  be  the  basis  of  all  that  follows.  In 
Section  3  we  shall  present  a  number  of  theorems  that  show  that  by  putting  more 
and  more  structure  on  the  parameter  Hilbert  space  (via  entropy  conditions) 
finer  and  finer  bounds  on  the  distribution  of  the  maximum  can  be  obtained. 


Proofs  are  deferred  to  Section  5.  Section  4  contains  a  number  of  examples,  in 
which  we  apply  the  results  on  the  isonormal  process  to  specific  problems.  For 
example,  we  obtain  sharp  (in  the  sense  of  best  possible  power  a)  bounds  for 
the  maximum  of  a  rectangle  indexed  Brownian  sheet.  In  Section  6  we  conclude 
with  some  comments. 


Acknowledgements. Some of the results presented here, when restricted to the
class of homogeneous Gaussian fields on $\mathbb{R}^k$, have a significant
overlap with the "extended Fernique inequality" in Berman (1985a). We had
already obtained these results independently before hearing, from Professor
Berman, of this work. However, when he very kindly sent us a preliminary
(still untyped) version of his results we took advantage of the opportunity to
combine what was best in both proofs, and so the statements and proofs of
Theorems 3.2 and 3.3, when restricted to simple random fields, have much in
common with his results. As our examples show, however, even for simple
fields, our later theorems go beyond his in identifying the optimal power.

We are also grateful to Larry Brown, who did most of the hard work in
Adler and Brown (1986). It was his insight on the problems tackled there that
set us off on the current work.

Both  a  referee,  and  Professor  Weber  himself,  drew  our  attention  to  the 
results  of  Weber  (1978,  1980).  We  are  grateful  to  Professor  Weber  for  corres¬ 
pondence  helping  to  clarify  the  relationships  between  his  work  and  an  earlier 
version  of  this  paper. 


2.  THE  ISONORMAL  PROCESS  AND  A  FERNIQUE  INEQUALITY 

The central idea is to study one, canonical, Gaussian process,
and then relate any particular process to this one. It is defined as
follows. Call a sequence $\{X_n\}$ of random variables orthogaussian iff
they are independent with law $\mathcal{L}(X_n) = N(0,1)$. Let $H$ be a real,
infinite-dimensional Hilbert space. A linear map $L$ from $H$ into real
Gaussian variables with $E\,L(x) = 0$ and $E\,L(x)L(y) = (x,y)$ for all
$x,y \in H$ is called the isonormal Gaussian process on $H$ (c.f. Segal (1954),
Dudley (1967, 1973)). For example, if $\{x_n\}$ is an orthonormal basis for
$H$ so that for $x \in H$, $x = \sum_n a_n x_n$, we can let
$L(x) = \sum_n a_n Y_n$, where the $Y_n$ are orthogaussian.
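
The basis construction $L(x) = \sum_n a_n Y_n$ can be tried out in finite dimensions. The Python sketch below is only an illustration under that assumption (a three-dimensional space stands in for $H$, and all names are ours): it draws the coefficients' orthogaussian partners, forms $L$ at two vectors, and compares the empirical value of $E\,L(x)L(y)$ with the inner product $(x,y)$.

```python
import random

def inner(x, y):
    """Euclidean inner product, standing in for (x, y) on H."""
    return sum(a * b for a, b in zip(x, y))

def isonormal_sample(vectors, rng):
    """One joint realisation of L(x) = sum_n a_n Y_n for every x in `vectors`,
    using a single draw of the orthogaussian sequence Y_1, ..., Y_d."""
    dim = len(vectors[0])
    ys = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [inner(x, ys) for x in vectors]

rng = random.Random(1)
x = [1.0, 2.0, 0.0]
y = [0.5, -1.0, 3.0]
n = 50_000
acc = 0.0
for _ in range(n):
    lx, ly = isonormal_sample([x, y], rng)
    acc += lx * ly
estimate = acc / n              # Monte Carlo estimate of E L(x) L(y)
print(estimate, inner(x, y))    # the two numbers should be close
```

Because every $L(x)$ is built from the same draw of the $Y_n$, linearity holds path by path, which is what lets $L$ carry all joint distributions at once.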

Since Gaussian distributions are uniquely determined by their
means and covariances, the isonormal process $L$ can be regarded as the
only real Gaussian process. For, if $\{x_t, t \in T\}$ is any real Gaussian
process with mean $E x_t = m_t$, then $L(x_t - m_t) + m_t$ is another version
of the process, where we take $L^2(\Omega,P)$ for $H$. On $H$, $L$ "remembers"
the covariance structure of $x_t$, and, by its linearity, also keeps track
of all joint distributions. Thus, we can in general neglect the specific
joint distributions of $x_t$ on $(\Omega,P)$ and work only with the
abstract geometric structure of the function $t \mapsto x_t - m_t \in H$. To
see precisely how this works in practice, see the examples in Section 4.

In order to study the structure of $H$, we shall require the notion
of metric entropy. Let $C$ be a subset of a metric space $(S,d)$. Given
$\epsilon > 0$, let $N(C,\epsilon) = N_C(\epsilon)$ be the minimal number of
points $x_1,\dots,x_m$ from $C$ such that for all $y \in C$,
$\min_i d(x_i,y) \le \epsilon$. We assume $N$ finite for all $\epsilon > 0$.
Consequently, there exist sets $A_1,\dots,A_m$ covering $C$ such that for all
$n$, $d(x,y) \le 2\epsilon$ for all $x,y \in A_n$. Set
$H_C(\epsilon) = \log N_C(\epsilon)$. Then $H_C(\epsilon)$ is the metric
entropy of $C$. Metric entropy is well known to play an important role in
continuity problems for Gaussian processes. For example, $L$, restricted to
$C \subset H$, is sample continuous if $\int_0^1 H_C^{1/2}(x)\,dx < \infty$.
Metric entropy can also be used to study suprema problems. For example,
Weber (1980) has shown that if $\|x\| = 1$ for all $x \in C$, and certain other
side conditions hold, then

(2.1)  $P\{\sup_{x\in C} |Lx| > \lambda + \eta\} \le \mathrm{const}\cdot N(C,v(\lambda))\,\Psi(\lambda)$,

where

$\Psi(\lambda) = P\{|Lx| > \lambda\} = \sqrt{2/\pi}\int_\lambda^\infty e^{-u^2/2}\,du$,

$\eta$ is a multiple, depending only on $\rho$, of
$\int_0^{\rho v(\lambda)} [H(C,\epsilon) - \log\epsilon]^{1/2}\,d\epsilon$,

$v(\lambda) = \inf\{0 < \epsilon < \epsilon_0 : h(\epsilon) \le \lambda\}$,

$h(\epsilon) = [H(C,\epsilon) - \log\epsilon]^{1/2}$,

$\epsilon_0 = \inf\{0 < \epsilon < 1 : N(C,\epsilon) \le 2\}$,

and $\rho \in (0,1)$ is arbitrary. Assuming $\eta$ is small enough for large
$\lambda$ (as is usually the case), that $v$ is at most polynomial, and that
the entropy is polynomial, we see that (2.1) is a result of the form of (1.4),
which is what we are seeking.

There are, however, two difficulties with Weber's result, insofar as
general best upper bounds are concerned, and, in particular, in relation to the
examples from the theory of empirical processes that motivated us. The first
is the assumption that $\|x\| = 1$ for all $x$. It is possible to get around
this in the general case by noting

(2.2)  $P\{\sup_{x\in C} Lx > \lambda\} \le P\{\sup_{y\in\tilde{C}} Ly > \lambda/\sigma\}$,

where $\sigma = \sup_{x\in C}\|x\|$, and $\tilde{C} = \{y : y = x/\|x\|, x \in C\}$.
It is not hard to see that the entropy function for $\tilde{C}$ follows the
same general behaviour as that for $C$, and since $\|y\| = 1$ for
$y \in \tilde{C}$ Weber's result then gives a bound for (2.2). However, it is
easy to check via examples such as Example 4.1 that this procedure does not
give the sharpest bounds possible.

The second difficulty is somewhat more fundamental, and essentially
insurmountable, even if Weber's results did not assume $\|x\| = 1$. It lies in
the fact that a methodology based purely on metric entropy cannot always give
the best bounds. To see this, one example will suffice. In Section 4 we show
how to calculate supremum distributions for general processes by assigning to
each process a particular Hilbert space, and then studying $L$ on that space.
It is easy to see that the Wiener process, $W(t)$, $t \in [0,2]$, and the
stationary Slepian process $S_t := W_{t+1} - W_t$, $t \in [0,1]$, generate
identical (up to a constant) entropy functions since

$E\{|W_t - W_s|^2\} = |t - s| = \tfrac{1}{2}E\{|S_t - S_s|^2\}$, $0 \le s,t \le 1$.

Thus any bound for the supremum distributions of $W$ and $S$ on $[0,1]$ coming
from metric entropy considerations involving only $H$ must be the same. But it
is well known that whereas
$P\{\sup_{[0,1]} W_t > \lambda\} = O(\lambda^{-1}e^{-\lambda^2/2})$, we have
$P\{\sup_{[0,1]} S_t > \lambda\} = O(\lambda e^{-\lambda^2/2})$.

In general, then, the problem is that different processes may have essentially
the same metric entropy, but quite different suprema distributions.
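
The identity between the two increment variances can be checked directly from the covariance functions. The following Python sketch assumes the standard covariances $\min(s,t)$ for $W$ and the triangular function $\max(0, 1-|t-s|)$ for $S_t = W_{t+1} - W_t$; the helper names are ours. It also prints the pointwise variances, which is where the two processes differ.

```python
def cov_wiener(s, t):
    """Covariance of the Wiener process: E W_s W_t = min(s, t)."""
    return min(s, t)

def cov_slepian(s, t):
    """Covariance of S_t = W_{t+1} - W_t: the overlap of [s, s+1] and [t, t+1]."""
    return max(0.0, 1.0 - abs(t - s))

def increment_variance(cov, s, t):
    """E|X_t - X_s|^2 written in terms of the covariance function of X."""
    return cov(s, s) - 2.0 * cov(s, t) + cov(t, t)

grid = [i / 10 for i in range(11)]
worst = 0.0
for s in grid:
    for t in grid:
        dw = increment_variance(cov_wiener, s, t)
        ds = increment_variance(cov_slepian, s, t)
        worst = max(worst, abs(dw - abs(t - s)), abs(0.5 * ds - abs(t - s)))
print("largest deviation from |t - s|:", worst)
print("Var W_t on the grid:", [cov_wiener(t, t) for t in grid])
print("Var S_t on the grid:", [cov_slepian(t, t) for t in grid])
```

The induced metrics agree up to the factor 2, but $\mathrm{Var}\,W_t = t$ reaches its maximum only at $t = 1$ while $\mathrm{Var}\,S_t = 1$ everywhere, which is the kind of information that an entropy function alone cannot see.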

In order to solve this problem we shall require finer partitions on $C$
than those obtainable just from entropy considerations. To this end,
for given $\delta \ge 0$ set

(2.1)  $C_\delta^+ = \{x \in C : \|x\| > \delta\}$,  $C_\delta^- = \{x \in C : \|x\| \le \delta\}$,

where $C \subset H$ and $\|\cdot\|$ is the $H$-induced norm. Now define

(2.2)  $N_C^+(\delta,\epsilon) := N(C_\delta^+,\epsilon)$,  $N_C^-(\delta,\epsilon) := N(C_\delta^-,\epsilon)$.

Since $C = C_\delta^+ \cup C_\delta^-$, it is obvious that
$N_C(\epsilon) \le N_C^+(\delta,\epsilon) + N_C^-(\delta,\epsilon)$
for all $\delta$ and $\epsilon$. We shall need one more entropy function,

(2.3)  $N_C(\delta_1,\delta_2,\epsilon) := N(C_{\delta_1}^+ \cap C_{\delta_2}^-,\epsilon)$,  $0 \le \delta_1 < \delta_2$,  $\epsilon > 0$.

The motivation behind this last entropy function should be clear.
The idea is to first break up $C$ into regions over which $L(x)$ has
a variance ($= \|x\|^2$) within certain bounds, and then to measure the
"size" of each of these regions via entropy considerations. This will
provide the finer information we shall need (particularly for
non-homogeneous processes for which $\|x\|$ is not constant over $H$) to obtain
sharp bounds for the distribution of $\sup L(x)$.
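
The idea of splitting $C$ by the size of $\|x\|$ and then measuring each piece separately can be sketched numerically. In the Python fragment below the set $C$ is a hypothetical finite grid in the plane, and the greedy construction only gives an upper bound on the covering number of each shell, not the minimal value that enters the definitions; both are assumptions made for illustration.

```python
import math

def greedy_net(points, eps):
    """Greedy eps-net with centres taken from `points`. Every point ends up
    within eps of a centre, so len(net) is an upper bound on N(points, eps)."""
    centres = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centres):
            centres.append(p)
    return centres

def shell(points, lo, hi):
    """The region {x in C : lo < ||x|| <= hi} on which the norm is pinned down."""
    return [p for p in points if lo < math.hypot(*p) <= hi]

# Hypothetical C: a 21 x 21 grid in the unit square, as a subset of R^2.
C = [(i / 20, j / 20) for i in range(21) for j in range(21)]
sigma = max(math.hypot(*p) for p in C)          # sup of ||x|| over C
eps = 0.1
cuts = [(0.0, 0.5), (0.5, 1.0), (1.0, sigma)]
for lo, hi in cuts:
    part = shell(C, lo, hi)
    print((lo, round(hi, 3)), len(part), len(greedy_net(part, eps)))
```

Only the outermost shell, where the variance $\|x\|^2$ is near its maximum, matters for the leading behaviour of the tail, which is why its entropy is tracked separately in what follows.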

We can now commence setting up the basic (Fernique type)
inequality from which all our other results will ultimately follow.
To this end, set

$\sigma = \sup_{x\in C}\|x\|$.

Let $\{\delta_i\}$ be a sequence satisfying
$0 = \delta_0 < \delta_1 < \dots < \delta_m = \sigma$, with $m$ possibly
infinite. For each $i = 1,\dots,m$ let $\epsilon_{ij}$, $j = 1,2,\dots$, be an
infinite monotone sequence such that $\lim_{j\to\infty}\epsilon_{ij} = 0$. We
shall use these two sequences to partition $C$ as the union of the
$C(\delta_{i-1},\delta_i)$, where

$C(\nu,\eta) := C_\nu^+ \cap C_\eta^- = \{x \in C : \nu < \|x\| \le \eta\}$,  $0 \le \nu < \eta \le \sigma$.

Note that for every $j$ there is a finite collection of points of
$C(\delta_{i-1},\delta_i)$, which we shall denote by $C_{ij}$, satisfying

(2.4)  $\#C_{ij} = N_C(\delta_{i-1},\delta_i,\epsilon_{ij})$,

(2.5)  for all $y \in C(\delta_{i-1},\delta_i)$ there exists an $x \in C_{ij}$
such that $\|x - y\| \le \epsilon_{ij}$.

(Here $\#A$ is the cardinality of $A$.)

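We shall need one more double sequence $\lambda_{ij}$, $i = 1,\dots,m$,
$j = 0,1,2,\dots$, of positive numbers. Clearly, since $\|x\| \le \delta_i$ for
every $x \in C_{i1}$,

(2.6)  $P\{\sup_{x\in C_{i1}} |Lx| > \lambda_{i0}\delta_i\} \le N_C(\delta_{i-1},\delta_i,\epsilon_{i1})\,\psi(\lambda_{i0})$,

where

(2.7)  $\psi(u) = \sqrt{2/\pi}\int_u^\infty e^{-x^2/2}\,dx$.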
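
The function $\psi$ of (2.7) is the two-sided tail of a standard normal variable, so it can be evaluated through the complementary error function. The short Python sketch below (function names are ours) computes $\psi$ and the crude "number of points times single-point tail" union bound that the chaining argument applies at each level; the number of points and the level are arbitrary inputs chosen for illustration.

```python
import math

def psi(u):
    """psi(u) = sqrt(2/pi) * integral from u to infinity of exp(-x^2/2) dx,
    i.e. P{|Z| > u} for a standard normal Z, written via erfc."""
    return math.erfc(u / math.sqrt(2.0))

def union_bound(n_points, lam):
    """Union bound n_points * psi(lam), capped at 1, for the probability that
    any of n_points centred Gaussians with standard deviation at most delta
    exceeds lam * delta in absolute value."""
    return min(1.0, n_points * psi(lam))

for lam in (2.0, 4.0, 6.0):
    print(lam, psi(lam), union_bound(1000, lam))
```

The bound is useless until $\psi(\lambda)$ is small compared with the reciprocal of the number of points, which is why the growth of the entropy functions controls the power of $\lambda$ in the final estimates.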

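Furthermore, for each $x \in C(\delta_{i-1},\delta_i)$ there is a point
$x_{ij}(x) \in C_{ij}$ such that $\|x - x_{ij}(x)\| \le \epsilon_{ij}$.
Consequently

$P\{\sup_{x\in C_{i,j+1}} |Lx - Lx_{ij}(x)| > \lambda_{ij}\epsilon_{ij}\} \le N_C(\delta_{i-1},\delta_i,\epsilon_{i,j+1})\,\psi(\lambda_{ij})$,

from which it follows that

(2.8)  $P\{\sup_{x\in C_{i,j+1}} |Lx| > \lambda_{i0}\delta_i + \sum_{k=1}^{j}\lambda_{ik}\epsilon_{ik}\} \le \sum_{k=0}^{j} N_C(\delta_{i-1},\delta_i,\epsilon_{i,k+1})\,\psi(\lambda_{ik})$.

Now note that, as $j \to \infty$, $C_{ij}$ becomes dense in
$C(\delta_{i-1},\delta_i)$. Consequently, choosing a separable version of $L$
we obtain from (2.8) that

$P\{\sup_{x\in C(\delta_{i-1},\delta_i)} |Lx| > \lambda_{i0}\delta_i + \sum_{j=1}^{\infty}\lambda_{ij}\epsilon_{ij}\} \le \sum_{j=0}^{\infty} N_C(\delta_{i-1},\delta_i,\epsilon_{i,j+1})\,\psi(\lambda_{ij})$.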

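It is now trivial to check the truth of the following inequality, which
forms the basis of the remainder of the paper.

Basic Inequality  For sequences $\delta_i$, $\lambda_{ij}$ and $\epsilon_{ij}$
satisfying $0 = \delta_0 < \delta_1 < \dots < \delta_m = \sigma$ ($m$ possibly
infinite) and $\epsilon_{ij} \downarrow 0$ as $j \to \infty$ for all $i$,
separable versions of $L$ satisfy

(2.9)  $P\{\sup_{x\in C} |Lx| > \sum_{i=1}^{m}\lambda_{i0}\delta_i + \sum_{i=1}^{m}\sum_{j=1}^{\infty}\lambda_{ij}\epsilon_{ij}\} \le \sum_{i=1}^{m}\sum_{j=0}^{\infty} N_C(\delta_{i-1},\delta_i,\epsilon_{i,j+1})\,\psi(\lambda_{ij})$.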

Note that this basic estimate is extremely general, and not particularly
informative. Our task now will be to propose meaningful, checkable
conditions on $N_C(\nu,\eta,\epsilon)$, and, by judicious choices of the
various sequences in the basic inequality, reduce the various sums in (2.9) to
simple, useful forms.


3.  MAIN  RESULTS 


There are basically two types of possible growth rates for entropy
functions that yield interesting results on $\sup Lx$: polynomial growth
of the form $N_C(\epsilon) \sim a\epsilon^{-\kappa}$, or exponential growth of
the form $N_C(\epsilon) \sim a\exp(\epsilon^{-\kappa})$. Faster than
exponential growth rates yield discontinuous, unbounded processes for which no
non-trivial bound on the distribution of $\sup L$ can exist, and slower than
power rates are generally just not interesting.

In this paper we shall study only polynomial entropies, and shall show
how to relate the $\kappa$ above to the $\alpha$ of (1.4). For some remarks on
exponential entropies, see Section 6.

Polynomial  entropies,  while  initially  seemingly  restrictive,  cover 
a  wide  range  of  examples,  including  random  fields  indexed  by  finite 
dimensional  Euclidean  space  and  processes  indexed  by  spaces  of  sets, 
such  as  polygons,  that  are  describable  by  a  finite  number  of  parameters. 

Processes indexed by Vapnik-Cervonenkis classes of sets or functions
(c.f. Section 6) are also described by polynomial entropies (c.f., for
example, Dudley (1973, 1978, 1984)).

For the first result, we shall assume only minimal information on
$C$, which also turns out to be all that is required if $L$ is stationary
on $C$ (implied by $\|x\| =$ const. for all $x \in C$ and $(x,y) = f(x-y)$ for
all $x,y \in C$ and some positive definite $f$). To be more precise, we assume
there exist positive constants $a$ and $\kappa$ such that

(3.1)  $N_C(\epsilon) \equiv N_C(0,\sigma,\epsilon) \le a\epsilon^{-\kappa}$

for small enough $\epsilon$. Then it is easy to show via the basic inequality
(2.9) (c.f. Section 5) that for large enough $p \ge 2$ and all
$\lambda > (1 + 4\kappa\ln p)^{1/2}$


(3.2)
P{sup  |Lx|  >A(a+2p"^)}  if-ap^'^J  e“*®^^du  .
xeC  X

To  the  reader  acquainted  with  Fernique  (1975)  this  inequality  should 
appear  familiar,  for  he  has  a  similar  inequality  for  processes  on 
Euclidean  space.  It  is  in  fact  a  simple  matter  to  derive  Fernique's 
inequality  from  (3.2). 

Via (3.2) it is not hard to prove the following result, closely related
to Théorème 2.1 of Weber (1980) in the case $\|x\| = \sigma = 1$ for all $x$.

Theorem 3.1  Suppose $N_C(\epsilon) \le a\epsilon^{-\kappa}$ for all
$\epsilon \in (0,\epsilon_0]$. Define the following constants.


max(eQ"'*,2,2<+l ) 


0  <  K  <  4, 


b  =  b(<,eQ)  =  < 

» 

=  |a(a-F's), 


max(eQ~^,2,l  +  2v^  )ln  <)  < 
=  |a(o+i5)exp{(2a  +  . 


>  4, 


Then,  for  all  X  ^  2b(a  +  h)^, 

(3.3)  P{sup  |Lxl  >  X}  <  exp{2(a+x‘^)/o‘*} 

XeC  ' 

Two things should be noted about this result. The first is that
since the assumptions say nothing about the variation of $\|x\|$ on $C$,
(3.3) is unlikely to lead to sharp bounds for non-homogeneous processes.
In fact, it doesn't. Secondly, the constants in (3.3), while a little
unwieldy, are identifiable. As we assume finer structure on $C$, while
we shall get smaller powers of $\lambda$ in (3.3), we shall lose
track of the constants. (In principle, we could always keep track of the
constants, but one reaches a point where they become so complicated that
it no longer seems worthwhile to expend the not inconsiderable effort
required to do so.)

Our first step away from homogeneity will be to divide $C$ into
two regions, in one of which $\|x\|$ is close to its maximum $\sigma$, and to
concentrate on the separate entropies of these regions. In particular,
from experience with Gaussian processes (e.g., Berman (1985b))
we should expect that the behaviour of $P\{\sup_C |Lx| > \lambda\}$ for large
$\lambda$ should be determined primarily by the entropy
$N_C(\delta,\sigma,\epsilon)$ as $\delta \to \sigma$.
This idea leads to the following result, in which, in most applications,
we shall choose an $f$ such that $f(\delta) \to 0$ as $\delta \to \sigma$.

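Theorem 3.2  Let $f : (0,\sigma) \to \mathbb{R}$ be such that
there exist positive constants $a$, $\kappa$ and $\epsilon_0$ such that for all
$\epsilon \in (0,\epsilon_0]$, $\delta \in (0,\sigma)$,

(3.4)  $N_C(0,\delta,\epsilon) \le a\epsilon^{-\kappa}$,  $N_C(\delta,\sigma,\epsilon f(\delta)) \le a\epsilon^{-\kappa}$.

Then for each $\delta$ and all $\lambda > \lambda^*(\epsilon_0,\delta,a,\kappa,f)$ we have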

(3.5)  P{sup  |Lx|  >  X} 

XeC 

<  I-  a(atl  (8)  +  [x-^  t 

a"  2 

<  Mx"^e‘^^^^®^{X^'^f^(6)  +  [x"^  +  ^0-6)]’'^  } 

where  M  =  |-  a (a+^) exp {^-^•2111}  gn^  X*  is  the  smallest  x  satisfying 

a** 

the  following  three  conditions: 

X  i  [m1n(ii,c„)  -  i2^]-'x  , 


(3.6)
(3.7)  X  ^  max(2,ef.“*^).f‘''^(6).


(3.8)  X  >  { 


2 (a  +  ^)^(2k+1 ) 

2(a  +  i5)^(l+2i^2,nK:) 


0  <  K  <  4, 


K  ^  4  . 


Note how the conditions on the constants are becoming unwieldy.

To see how this result works, let us prove a simple corollary.
The idea of the corollary is to introduce a parameter of "non-homogeneity",
$\alpha$, for $C$ that describes the sizes of subsets of $C$ over which
$\|x\|$ is close to its overall supremum $\sigma$. Homogeneity is described by
$\alpha = 0$, with increasing $\alpha$ describing increasing non-homogeneity.
The result is

Corollary 3.1  Under the conditions of Theorem 3.2, if $f$ satisfies

(3.9)  $f(\delta) \le c(\sigma - \delta)^{\alpha}$

for some positive $\alpha$ and $c$ then for sufficiently large $\lambda$

(3.10)  $P\{\sup_{x\in C} |Lx| > \lambda\} \le M\lambda^{-1+2\kappa/(1+\alpha)} e^{-\lambda^2/(2\sigma^2)}$,

where 

M  =  |a(c  +  2'")(a  +  is)exp  2(a+l)/an  . 

(The interested reader can easily substitute into (3.6) - (3.8) to
make the statement "sufficiently large $\lambda$" more precise.)

Proof.  Set $\delta = \sigma - \lambda^{-2/(1+\alpha)}$, taking $\lambda$
large enough for $\delta$ to be positive. It is then straightforward to check
that (3.6) - (3.8) are satisfied for large enough $\lambda$. Clearly, as
$\lambda \to \infty$ we have $\delta \to \sigma$. To prove the corollary
consider the last term in (3.5)

+  [a"^ <_  +  {X~^  +  hX~^^ 

<  (C  +  2^)A  . 

again  for  sufficiently  large  A  .  Substituting  this  into  (3.5)  establishes 
the  corollary. 
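The exponent bookkeeping in this proof can be checked numerically. The sketch below assumes that the bracketed term of (3.5) reads $\lambda^{2\kappa}f^{\kappa}(\delta) + [\lambda^{-1} + \tfrac12(\sigma-\delta)]^{-\kappa}$ and takes the worst case $f(\delta) = c(\sigma-\delta)^{\alpha}$ permitted by (3.9); the function name and the parameter values are illustrative only.

```python
import math

def corollary_terms(lam, c, alpha, kappa):
    """Evaluate both bracketed terms of (3.5) at delta = sigma - lam**(-2/(1+alpha)).

    Only the gap sigma - delta enters, so sigma itself is not needed.
    """
    gap = lam ** (-2.0 / (1.0 + alpha))             # sigma - delta
    f_delta = c * gap ** alpha                      # worst case allowed by (3.9)
    first = lam ** (2 * kappa) * f_delta ** kappa   # lam^{2 kappa} f^kappa(delta)
    second = (1.0 / lam + 0.5 * gap) ** (-kappa)    # [lam^{-1} + (sigma-delta)/2]^{-kappa}
    target = lam ** (2 * kappa / (1.0 + alpha))     # power of lam claimed in the proof
    return first, second, target

first, second, target = corollary_terms(lam=50.0, c=3.0, alpha=0.5, kappa=2.0)
print(first / target)    # equals c**kappa up to rounding
print(second / target)   # stays below 2**kappa
```

The first ratio reproduces $c^{\kappa}$ and the second is bounded by $2^{\kappa}$, which is exactly the constant $(c^{\kappa} + 2^{\kappa})$ appearing in the proof.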

Note, again, that large $\lambda$ sends $\delta$ to $\sigma$. That is, it is only the neighborhood in C for which $\|x\|$ is close to $\sigma$ that has any effect on the distribution of $\sup|Lx|$. To convince ourselves that the assumption (3.9) has actually led to a sharper bound, we need only note that the power of $\lambda$ in (3.10) is never larger than that in (3.3), where no such assumption was made.

Our next assumption on C will be that it possesses some sort of scaling property, in the sense that there are subsets of C which look much like C itself, except that the original norm has been changed by a scaling factor. The idea then is to partition C into a number of smaller pieces, study the supremum on each one of these via Theorem 3.2 (to yield Theorem 3.3), and then piece the various bounds together to bound the supremum over C itself (Theorems 3.4, 3.5).

To this end, fix $\theta > 0$ and let $G_\theta$ be a partition of C satisfying

(3.11) $\sup_{x,y \in A} \|x-y\| \le \theta$ for all $A \in G_\theta$.

Define $N_C^G(\theta) := \#G_\theta$. Clearly $N_C^G(\theta) \ge N_C(\theta)$, since the latter entropy is related to a $G_\theta$ of minimal cardinality. In general, however, we shall want to choose $G_\theta$ so that both entropies are effectively the same. Now we introduce the "scaling hypothesis", by assuming the existence of a function f and a constant a such that

(3.12) $N_A(f(\theta)\epsilon) \le a\epsilon^{-\kappa}$ for all $A \in G_\theta$,

and small enough $\epsilon, \theta > 0$. Such an f always exists. (Take f = 1!) Clearly, however, for this partitioning procedure to have any value, we shall want $f(\theta) \to 0$ as $\theta \to 0$. Nevertheless, it is not necessary to assume this at this stage, and the bounds in Theorem 3.3 and its corollaries are correct for any f. If f does not decrease to zero, however, they are uninteresting.

Note that it would be nice to replace (3.12) with the more pleasing condition $N_A(f(\theta)\epsilon) \le N_C(\epsilon)$ comparing entropies. However, such a condition turns out to be impractical in examples, since we generally do not have the precise form of $N_C(\epsilon)$, but only its growth rate.

Note, also, that we can always take $N_C(\theta)$ to be non-increasing, and, given some f satisfying (3.12), its left continuous monotone (non-decreasing) rearrangement also satisfies (3.12). Thus, in what follows, we shall always take f left continuous. Consequently, fixing some $\rho \ge 2$, the function

$g(\theta) := \theta + 2f(\theta)/\rho^2$

can also be taken to be left continuous, so that its inverse

$g^{-1}(\eta) := \sup\{\theta : g(\theta) < \eta\}$

is well defined. We can now state the following result, which is closely related to Théorème 2.1.1 of Weber (1978) in the case $\|x\| = 1$. Our style of proof is completely different, however.
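For a concrete feel for the generalized inverse $g^{-1}$, the following minimal sketch evaluates it by bisection, assuming g is non-decreasing on a bounded range $(0, \theta_{max}]$. The helper names, the search range and the choice $f(\theta) = \theta$ are illustrative assumptions, not part of the text.

```python
import math

def g(theta, f, rho):
    """g(theta) = theta + 2 f(theta) / rho^2."""
    return theta + 2.0 * f(theta) / rho ** 2

def g_inverse(eta, f, rho, theta_max=1.0, iters=200):
    """Approximate sup{theta in (0, theta_max]: g(theta) < eta} by bisection.

    Assumes g is non-decreasing, so the set is an interval starting at 0.
    """
    if g(theta_max, f, rho) < eta:
        return theta_max
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid, f, rho) < eta:
            lo = mid          # mid still belongs to the set
        else:
            hi = mid          # mid is outside the set
    return lo

def theta_lambda(lam, kappa, rho, f):
    """Scale obtained by inverting g at [lam^2 (1 + 4 kappa ln rho)]^(-1/2)."""
    eta = (lam ** 2 * (1.0 + 4.0 * kappa * math.log(rho))) ** -0.5
    return g_inverse(eta, f, rho)

lam, kappa, rho = 5.0, 2.0, 2.0
numeric = theta_lambda(lam, kappa, rho, f=lambda t: t)
closed = 1.0 / (lam * (1.0 + 2.0 / rho ** 2) * math.sqrt(1.0 + 4.0 * kappa * math.log(rho)))
print(numeric, closed)
```

With $f(\theta) = \theta$ the function g is linear, so the bisection result can be compared with the explicit inverse $\eta/(1 + 2\rho^{-2})$.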


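Theorem 3.3 Suppose $N_C(\epsilon) \le a\epsilon^{-\kappa}$ for $\epsilon \in (0,\epsilon_0]$, and that, for all $\theta \in (0,\theta_0]$, $G_\theta$ and f satisfy (3.12). Then for every $\rho \ge \max(2,\epsilon_0^{-1/2})$, any $A \in G_\theta$, $\sigma_A = \sup_{x \in A}\|x\|$, and all $\lambda > g(\theta)(1+4\kappa\ln\rho)^{1/2}$,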


(3.13)  P(sup  |Lx]  >  A)  n;/(Cx-g(9)(l+4Kiinp)'^]/a.) 

XeA  2 

+  4ap  '^ij)(A/g(0)) 

+  4ap2\x'''e‘^'/^'^A*  exp(x2g2{9)/2o^M  . 


There is an easy corollary to this theorem that is far more illuminating. For large enough $\lambda$, set

(3.14) $\theta_\lambda = g^{-1}([\lambda^2(1+4\kappa\ln\rho)]^{-1/2})$

and substitute into (3.13). Then apply the standard inequality $\psi(u) \le u^{-1}e^{-u^2/2}$, $u > 0$, to obtain

Corollary  3.2  Under  the  conditions  of  Theorem  3.3  we  have,  for  all 

X  >  max(l  .1  ,{g(t^)(l+4Kiinp)'^}"^),  AcG  , 

®x 

P{sup  |Lx|  >  X}  £C,X  *e  +  c„x”^exn{-^x‘+(l+4ici,np)} , 

XeA  ^ 

where 

Cj  =  6a^e^^°  +  4ap2^o^  2exp{(2a^‘+(l+4ic]!,np)  )"^ } 

=  4ap2'^(l+4Kj!,np)'^ 

(The  constant  6  in  comes  from  X  >  1.1.  In  general,  6  can  be  replaced 
by 

An irritating aspect of both Theorem 3.3 and its corollary is that the constants diverge as $\sigma_A \to 0$. The same phenomenon occurs in Berman's (1985a) Theorem 3.1. In the following corollary, we show that this can easily be avoided via a simple trick, due, a referee tells us, to Lévy.

Corollary 3.3 Both Theorem 3.3 and Corollary 3.2 hold if we replace $\sigma_A$ in the bounds by any $\sigma \ge \sigma_A$, as long as we then double the constants.

(3.15) $P(\sup_t |Z_t| > \lambda) \le P(\sup_t Z_t > \lambda) + P(\inf_t Z_t < -\lambda)$

$= 2P(\sup_t Z_t > \lambda,\ Y \ge 0) + 2P(\inf_t Z_t < -\lambda,\ Y \le 0)$

$\le 2P(\sup_t |Z_t + Y| > \lambda)$
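The symmetrization inequality (3.15) can be illustrated by simulation. The sketch below assumes that Y is a symmetric (here Gaussian) variable independent of the process Z, and uses a ten-step Gaussian random walk as a stand-in for Z; the walk, the level and the sample size are arbitrary illustrative choices.

```python
import random

def simulate(lam, n_steps=10, n_paths=20000, y_sd=1.0, seed=0):
    """Estimate P(sup|Z_t| > lam) and P(sup|Z_t + Y| > lam) by Monte Carlo."""
    rng = random.Random(seed)
    step_sd = (1.0 / n_steps) ** 0.5        # walk has variance 1 at the final step
    hits_z = 0
    hits_zy = 0
    for _ in range(n_paths):
        y = rng.gauss(0.0, y_sd)            # Y is drawn independently of the walk
        z = 0.0
        max_abs_z = 0.0
        max_abs_zy = 0.0
        for _ in range(n_steps):
            z += rng.gauss(0.0, step_sd)
            max_abs_z = max(max_abs_z, abs(z))
            max_abs_zy = max(max_abs_zy, abs(z + y))
        hits_z += max_abs_z > lam
        hits_zy += max_abs_zy > lam
    return hits_z / n_paths, hits_zy / n_paths

p_z, p_zy = simulate(lam=1.0)
print(p_z, 2 * p_zy)   # the first estimate should not exceed the second
```

The factor 2 is the price paid for adding Y, and it is the source of the doubled constants in Corollary 3.3.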

To use this inequality, take $\sigma \ge \sigma_A$ and Y zero mean Gaussian with variance $\sigma^2 - \sigma_A^2$, independent of Lx for all $x \in A$, and define a new process $L^*$ by $L^*x = Lx + Y$. Consider the image of A under $L^*$, call it $A^*$, as part of an $L^2$ space of Gaussian variables, where for any two points u, v in the image such that $u = L^*x$, $v = L^*y$, $x,y \in A$, their inner product $(u,v)^*$ is given by $E(L^*x \cdot L^*y)$. Then clearly

$\|u\|^{*2} = \|x\|^2 + \sigma^2 - \sigma_A^2$, $\|u-v\|^* = \|x-y\|$.

Consequently, $\sup_{A^*}\|u\|^* = \sigma$ and $A^*$ has the same entropy function as A. Let I be the identity map on this set. Then I is clearly isonormal on $A^*$, and $\sup_{A^*}|Iu| = \sup_A|L^*x|$. Thus, we can apply Theorem 3.3 and Corollary 3.2 to I and then note (3.15) with Z = L to prove the corollary.

Now let us pause for a moment to consider the import of Theorem 3.3 and its corollaries. It is clear from Corollary 3.2 that for large $\lambda$ the dominant term in the bound is $O(\lambda^{-1}e^{-\lambda^2/2\sigma_A^2})$. But this is of the order of the probability that a single zero mean Gaussian variable with variance $\sigma_A^2$ is greater than $\lambda$. That is, we have replaced the supremum of L over A by its value at one point only. Essentially, this has been done by making A small as $\lambda$ becomes large, since $A \in G_{\theta_\lambda}$ and $\theta_\lambda$ will be small for $\lambda$ large. That is, we have achieved at this stage a discretization of the supremum problem. This is actually the heart of the solution, for all we need do now is sum the bounds of Theorem 3.3 and its corollaries over the various sets in $G_\theta$ to bound the supremum over the whole of C.

To  sum  these  bounds  efficiently,  we  require  further  assumptions 
on  the  structure  of  C  ,  as  in  the  following  two  results,  with  which 
we  complete  this  section,  and  in  which  we  finally  give  up  trying  to 
keep  track  of  constants.  In  the  first  result  we  shall,  as  in  Theorem  3.2 
concern  ourselves  primarily  with  regions  of  C  of  large  norm. 

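Theorem 3.4 Suppose $N_C(\epsilon) \le a\epsilon^{-\kappa}$ for $\epsilon \in (0,\epsilon_0]$, and that there exist constants c and $\beta$ such that for each $\theta \in (0,\theta_0]$ there exists a partition $G_\theta$ of C and constants $n_\theta$, $\delta_0(\theta)$ so that

(3.16) $n(\delta,\theta) \le c(\sigma-\delta)^{\beta}N_C^G(\theta) + n_\theta$ for all $\delta \in (0,\delta_0(\theta)]$,

where

(3.17) $n(\delta,\theta) := \#\{A \in G_\theta : A \cap C_\delta^+ \ne \emptyset\}$.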


Then,  there  exist  constants  c^  and  such  that  for  sufficiently 
large  a 


(3.18)  P(sup  iLxl  >  a)  £C^N-^(0,)A'^"^^(jinA)®e"^^/^''^ 
xeC  I  C  a 

.  ^  „  ,-l  -A2/2a2 


Here $c_1$ and $c_2$ depend on c, $\beta$, a, $\theta_0$, and an arbitrary $\rho$, but not on $\lambda$. The factor $\theta_\lambda$ is defined at (3.14).

Our final task is to free ourselves of the logarithmic term in (3.18) by partitioning C even more finely.

Theorem 3.5 Assume the assumptions of Theorem 3.4, but replace (3.16) by: There exists a $\Delta_0(\theta)$ such that for all $0 < \delta_2 - \delta_1 < \Delta_0$


4. EXAMPLES


Our examples are of two kinds. In some we simply re-derive known results. Our aim here is to show that the rather general theorems of the previous sections give, when applied to specific cases, the best possible results. The more interesting examples, which (by "induction") we also feel give the best possible bounds, are new. In particular, Examples 4.3 and 4.4, which consider the suprema of rectangle and half-plane indexed Brownian sheets, represent the first time sharp (asymptotic) bounds have been obtained for set indexed processes.

All our examples deal not with the isonormal process on Hilbert space H but with processes whose parameter space is generally somewhat simpler. Thus we shall have to translate these processes to the isonormal case. But this is easy, for if $X_t$ is a Gaussian process on, say, a metric space (S,d) with continuous covariance function R(s,t), then we simply identify H with the $L^2$ space of X, and $C \subset H$ with the set $\{x \in H : x = X_t$ for some $t \in S\}$. For $x = X_t$, $y = X_s$ in C we have $(x,y)_H = R(t,s)$. Clearly L is now the identity operator, so that Lx is simply x identified as a Gaussian variable rather than an element of H. Furthermore $\sup_{x \in C}|Lx| = \sup_{t \in S}|X_t|$.

Entropy calculations are only slightly more involved, for we shall generally partition C by first partitioning S (this is usually geometrically simpler) and then letting the above identification induce a corresponding partition on C. We shall work the first example carefully to explain what is happening. In the later examples, we shall skimp on


Example 4.1. Let X be a stationary, separable process on [0,1] with zero mean and covariance function R(t), which, for some positive $a_1$, $\alpha$ and $\gamma_1$ satisfies

(4.1) $1 \ge R(t) \ge 1 - a_1 t^{2\alpha}$ for all $t \in [0,\gamma_1]$.

Let  a(t)  be  a  positive,  continuous,  monotonically  increasing 
function  on  [0,1]  such  that  for  some  y2  >  0  <  a2  £  a^  and 

some  a  >  0 


(4.2)  whenever  jt-sj  <  y2  . 


Define  now  a  scaled  version  of  X  by 


Y(t)  =  a(t)X(t),  te[0,l]. 


We  think  of  Y  as  a  locally  stationary  process,  (c.f.  Berman  (1974)) 


and  shall  show  that 


(4.3)  P{sup| Y(t) I  >  A}  < 

[0,ll 


for  some  finite  c^  and  c^  and  all  A  >  0. 

Before  we  prove  (4.3),  which  we  shall  do  via  Theorem  3.5,  it  is 
instructive  to  consider  how  close  we  could  get  to  (4.3)  via  existing 
theory.  If  we  apply  Berman's  (1985a)  recent  bound,  then  the  best  we 


can  do  is  a  bound  of  the  form 


(4.4)  P{suplY(t)l  >  A}  < 

[0,ll 


„-U2/B,-»W0)  o<6<2«. 




This is clearly poorer than (4.3). [A proof of (4.4) follows easily from (4.5) below and Example 4.1 of Berman (1985a).] The above result could also be obtained, within the framework of this paper, via Theorem 3.3, which is effectively the analogue of Berman's result for the isonormal process.

One could also try to apply Weber's (1980) Théorème 2.1 here. In fact, his result is not strictly applicable unless strict equality holds in (4.1) and (4.2). Assuming this, one obtains a result like (4.4), but with an extra factor of $\log\lambda$ in the bounds. Thus Weber's result is weaker yet than Berman's.

Finally,  before  commencing  the  proof,  we  note  that  bounds  similar 
to  (4.3)  have  been  obtained  for  processes  displaying  covariance  behaviour 
similar  to  that  displayed  by  our  Y(t)  by  Piterbarg  and  Prisjaznjuk  (1979). 
They  actually  do  better  than  (4.3)  for  their  case,  for  using  arguments  in 
the  style  of  Pickands  (1969a, b)  they  both  identify  the  constants  in 
their  bound  and  show  that  the  bound  is  sharp. 

Throughout  the  proof  we  shall  consider  Y(t)  to  be  both  a  random 
variable  and  a  point  in  H.  From  (4.1)  and  (4.2)  we  have  that  for  all 
s,t  with  |t-s|  <  A  Y2 

(4.5)  ||Y(t)  -  Y(s)||^  =  E(la(t)X(t)  -  a(s)X{s)|2) 

<_  a^lt-si^®  +  20^(1  )a^  |t-s|^  . 

We now divide the argument into two distinct cases, and consider firstly $\beta \ge 2\alpha$. Then, via (4.5),

(4.6)  ||V(t)  -  Y(s)||  <  (a^  +  .= 




To partition C, for each $\epsilon > 0$, we simply partition the unit interval into sub-intervals each of length $(2\epsilon/a_4)^{1/\alpha}$, and then map these intervals into C by the correspondence $t \to Y(t)$. Clearly then $N_C(\epsilon) \le (2\epsilon/a_4)^{-1/\alpha}$ for small enough $\epsilon$, so that we have polynomial entropy with $\kappa = 1/\alpha$.

(Actually, it is not quite true that $N_C(\epsilon) \le (2\epsilon/a_4)^{-1/\alpha}$, for a true upper bound is $1 + [(2\epsilon/a_4)^{-1/\alpha}]$, where, here, [x] is the integer part of x. Nevertheless, to make life a little easier, let us agree here that henceforth every time we bound an entropy by some non-integer, we allow ourselves the freedom of adding a minor "integer-correction factor" if necessary. This involves no real loss of precision.)
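The "integer-correction factor" is easy to see in a small computation. The sketch below counts how many sub-intervals of the stated length are needed to cover [0,1] and compares the count with the corrected bound; the name `a` stands for the Hölder-type constant of (4.6) and, like the sample values, is only illustrative.

```python
import math

def interval_partition_size(eps, a, alpha):
    """Number of sub-intervals of length (2*eps/a)**(1/alpha) covering [0, 1].

    Returns the count together with the bound 1 + [ (2*eps/a)**(-1/alpha) ],
    where [x] denotes the integer part of x.
    """
    inverse_length = (2.0 * eps / a) ** (-1.0 / alpha)
    n_intervals = math.ceil(inverse_length)      # last interval may be shorter
    corrected_bound = 1 + math.floor(inverse_length)
    return n_intervals, corrected_bound

print(interval_partition_size(eps=0.03, a=1.0, alpha=1.0))
print(interval_partition_size(eps=0.01, a=2.0, alpha=0.5))
```

The count never exceeds the corrected bound, and it differs from the uncorrected power $(2\epsilon/a)^{-1/\alpha}$ by less than one, which is why the correction does not affect the polynomial rate $\kappa = 1/\alpha$.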

To  obtain  (4.3),  we  shall  apply  Theorem  3.5.  For  this  we  need  a 
handle  on  the  function  n (6^, 62,9)  of  (3.19),  and  to  determine  the 
for  this  problem.  To  do  this,  fix  9,  and  let  6^  be  the  partition  just 
described,  but  based  on  intervals  of  length  (0/a^)^^“.  Subdividing 
each  AgG  according  to  the  same  principle,  we  easily  obtain 
N^®(6)  =  (0/a^)~^'^“  and  N^(e9)  ^  for  small  enough  e  and  0. 

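Fix $\rho \ge 2$, and compare this with (3.12). We see we can take $f(\theta) = \theta$ there, so that the $g(\theta)$ of (3.13) is given by $g(\theta) = \theta(1+2\rho^{-2})$, and the $\theta_\lambda$ of (3.14) by

(4.7) $\theta_\lambda = \lambda^{-1}[(1+2\rho^{-2})(1+4\ln\rho/\alpha)^{1/2}]^{-1}$.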


Now  take  0(0)  <  6,  <  io(l)  and  consider  the  set  C  C 
—Id  0162 

It  is  easy  to  see  (we  leave  the  algebra  to  the  reader)  that  for 

62  -  ^1  <  di^y2  this  set  is  the  image  of  an  interval  in  [0,1]  of  length 

between  a^^  (62-61)^^“  and  a^^  (62"<5i 

To finally bound $n(\delta_1,\delta_2,\theta) := \#\{A \in G_\theta : A \cap C^+_{\delta_1} \cap C^-_{\delta_2} \ne \emptyset\}$ for $\delta_2 - \delta_1 < \Delta_0(\theta)$, set $\Delta_0(\theta) = \theta^{\beta/\alpha}$. Then since each $A \in G_\theta$ is the image of an interval of length $O(\theta^{1/\alpha})$ and $C^+_{\delta_1} \cap C^-_{\delta_2}$ the image of an interval of length at most $O(\theta^{2/\alpha})$, we have that $n(\delta_1,\delta_2,\theta)$ is at most two. Thus

(3.19)  is  satisfied  for  all  positive  e  with  n.  =  2,  and  all  the 

0 

conditions  of  Theorem  3.5  are  satisfied.  Consequently  (3.21)  holds  for 

every  8  1  2a.  Take  e  arbitrarily  large  in  (3.21)  to  obtain  (4.3)  and 

prove  our  result  for  the  case  6  ^  2a. 

Now  take  6  <  2a.  Then  by  (4.6)  we  have  for  small  (t-sj  tnat 

llY(t)  -  Y(s)||  ^  a^  [t-sj^^^  .  The  argument  above  thus  gives  coly- 

nomial  entropy  with  <  =  2/e.  Defining  G  as  the  image  of  intervals 
2/S 

of  length  (0/a^)  '  ,  we  once  again  find  f(0)  =  0  but  now 

The  set  ct  C“  remains  as  it  was  above.  Again  take  A_(0)  = 

O^i  O2  U 

and  consider  n((52»'5i *0) .  If  B  e  (a,2a)  then,  once  again,  as  9  V  0 
we  find  n(6^,62,0)  ±2  for  '^2  ~  4^(9).  Consequently,  in  this 

case  the  argument  is  precisely  as  above,  and  we  now  have  (4.3)  for 
all  8  >  a. 

If  0  <  8  <  a  the  intervals  mapping  into  G„  can  be  shorter 

+  -  2/6 

than  those  mapping  onto  the  C.  C  ,  (lengths  0(9  )  versus 

°1  °2 

(62  -  ^0(9^^“)).  Consequently,  noting  that  Nj.®(9)  =  (0/a^)'^^®, 

we  obtain 

n(6^,62,9)  i2  +c(62  -  6^)^^“Nj,^(e) 

for  some  finite  c.  That  is,  we  have  the  right  bound  for  (3.19)  of 
Theorem  3.5.  Substitution  into  (3.21)  completes  the  proof. 


Our remaining examples are all connected with Brownian sheets. Let $\lambda_k$ be Lebesgue measure on $[0,1]^k$. The zero mean Gaussian process W defined on Borel sets in $[0,1]^k$ with covariance

(4.8) $E[W(A)W(B)] = \lambda_k(A \cap B)$

is called the set indexed Brownian sheet. The pinned version of W, denoted by

$\mathring{W}(A) := W(A) - \lambda_k(A)W([0,1]^k)$,

has covariance

(4.9) $E(\mathring{W}(A)\mathring{W}(B)) = \lambda_k(A \cap B) - \lambda_k(A)\lambda_k(B)$.

For the special case of W indexed only by k-intervals of the form $A_t = \prod_{i=1}^k [0,t_i]$, we write $W(t) := W(A_t)$ and $\mathring{W}(t) := \mathring{W}(A_t)$, and call $W(t)$ and $\mathring{W}(t)$ the point indexed sheet and pinned sheet, respectively.

$W(t)$ is of particular interest as the natural k-dimensional generalisation of Brownian motion, while $\mathring{W}(A)$ arises as a weak limit in an empirical measure setting (c.f. Dudley (1978)). We start with the point indexed pinned sheet.
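For completeness, the covariance (4.9) follows from (4.8) in one line of algebra. The derivation below is a sketch using only the definition of the pinned sheet and the fact that the unit cube $I = [0,1]^k$ has Lebesgue measure one; the abbreviation I is introduced here for convenience.

```latex
\begin{align*}
E[\mathring{W}(A)\mathring{W}(B)]
  &= E\big[(W(A) - \lambda_k(A)W(I))\,(W(B) - \lambda_k(B)W(I))\big] \\
  &= \lambda_k(A \cap B) - \lambda_k(B)\lambda_k(A \cap I)
     - \lambda_k(A)\lambda_k(I \cap B) + \lambda_k(A)\lambda_k(B)\lambda_k(I) \\
  &= \lambda_k(A \cap B) - \lambda_k(A)\lambda_k(B),
\end{align*}
```

since $A \cap I = A$, $B \cap I = B$ and $\lambda_k(I) = 1$. In particular the variance of $\mathring{W}(A)$ is $\lambda_k(A)(1 - \lambda_k(A)) \le \tfrac14$.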


Example 4.2 Let $\mathring{W}$ be a point indexed pinned Brownian sheet on $[0,1]^k$. Then there exists a finite c such that

(4.10) $P\{\sup_t |\mathring{W}(t)| > \lambda\} \le c\lambda^{2(k-1)}e^{-2\lambda^2}$.

This result was originally established in somewhat greater generality in Adler and Brown (1985), where it was also shown that this bound serves, for different c, as a lower bound as well. It is not, however, obtainable from any other general Gaussian bound. Using Berman's (1984a) result, or our Theorem 3.3, the best bound possible is only $O(\lambda^{4k-1}e^{-2\lambda^2})$.
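In the simplest case k = 1 the pinned sheet is a Brownian bridge, and the form of (4.10) can be compared with Kolmogorov's classical series for the tail of the supremum of its absolute value. The sketch below is only a sanity check of that special case (with c = 2); it says nothing about general k, and the function name is illustrative.

```python
import math

def kolmogorov_tail(lam, terms=100):
    """P{ sup_{0<=t<=1} |B0(t)| > lam } for a Brownian bridge B0.

    Uses Kolmogorov's alternating series 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2).
    """
    return 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                     for j in range(1, terms + 1))

for lam in (0.5, 1.0, 1.5, 2.0):
    exact = kolmogorov_tail(lam)
    bound = 2.0 * math.exp(-2.0 * lam * lam)   # (4.10) with k = 1 and c = 2
    print(lam, exact, bound, exact / bound)
```

Because the series alternates with decreasing terms, the exact tail is dominated by its first term, and the ratio printed in the last column approaches one as $\lambda$ grows.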

We rederive the result here to show how it can be obtained from the general theory. Once again, we shall apply Theorem 3.5, so we are basically concerned with finding a good bound for $n(\delta_1,\delta_2,\theta)$, and the other factors in (3.19).

We commence by noting

(4.11) $\|\mathring{W}(t) - \mathring{W}(s)\|^2 = E[(\mathring{W}(t) - \mathring{W}(s))^2] \le \lambda_k(A_t \,\Delta\, A_s) \le \sum_{i=1}^k |t_i - s_i|$

for all $s,t \in [0,1]^k$. Now, for each $\theta > 0$ set $m_\theta = [k\theta^{-2}]$ ([x] := integer part of x) and define the partition $I_\theta$ of $[0,1]^k$ by

$I_\theta = \{A \subset [0,1]^k : A = \prod_{i=1}^k (n_i/m_\theta,\ (n_i+1)/m_\theta],\ n_i = 0,1,\dots,m_\theta - 1\}$.

Furthermore, let $G_\theta$ be the partition $I_\theta$ induces in H, the $L^2$ space of $\mathring{W}$. By (4.11), if $x,y \in A \in G_\theta$, then $\|x-y\| \le \theta$, so that $G_\theta$ is a partition of the type required for Theorem 3.5, and

(4.12) $N_C^G(\theta) = [k/\theta^2]^k \le k^k\theta^{-2k}$,

the inequality following by simple algebra. By (4.12), C has polynomial entropy with $\kappa = 2k$. We now check the scaling property.


Fix $\epsilon > 0$, set $p_\epsilon := [\epsilon^{-2}]$, divide each $A \in I_\theta$ into $p_\epsilon^k$ equal k-intervals, and map these into the corresponding $A \in G_\theta$. Applying (4.11) once again, it is easy to check that

N^(0e)  <_  3e 


for  all  e  <  (2k)  ^  and  AeG  . 

y 


Thus we can take $f(\theta) = \theta$ in (3.12) and, for some $\rho \ge 2$,

(4.13) $\theta_\lambda = \lambda^{-1}[(1+2\rho^{-2})(1+8k\ln\rho)^{1/2}]^{-1}$.


All  that  remains  is  to  investigate  n(6i.62.0)-  Firstly  note  that  it 


suffices  to  consider  6-j  >  *s,  for  we  can  break  up  C  into  two  parts. 


over  which  x  <  h  and  x  >  Over  the  first  part  the  inequality 


(1.1)  gives  us  an  upper  bound  of  0(e  °  )  for  the  tail  of  the  supremum, 


which  is  clearly  of  smaller  order  than  the  desired  (4.10).  Thus  the 


case  1  ^  can  be  neglected.  Now  note  that  n  C”  is  the  image 

1  6^  62 


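of the following set, in which we write $|t|$ for $t_1 \times \dots \times t_k$:

(4.14) $I(\delta_1,\delta_2) = \{t : \delta_1^2 \le |t|(1-|t|) \le \delta_2^2\}$

$= \{t : \tfrac12 - (\tfrac14 - \delta_1^2)^{1/2} \le |t| \le \tfrac12 - (\tfrac14 - \delta_2^2)^{1/2}\}$

$\cup\ \{t : \tfrac12 + (\tfrac14 - \delta_2^2)^{1/2} \le |t| \le \tfrac12 + (\tfrac14 - \delta_1^2)^{1/2}\}$.

The second line follows via a little elementary algebra. To count the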


number  of  A  from  that  intersect  1(5,,  6.,)  it  suffices  to  count 

9  I  c 


the  number  of  lattice  points  of  the  form  (n,/m„, . . . ,n, /m.,)  falling 

It)  K  T 


in  1(6^, 62^  out  tms  IS  relatively  easy,  tor  it  we  nx  ^ 

then  some  more  algebra  applied  to  (4.14)  shows  that  no  more  than  32/?’(62-'5 
values  of  are  permissible.  Allowing  n^,,..,n|^_^  to  vary,  we  thus 
obtain 

n(6^,62,e)  1  c(mg)'^"^(62-6^)^ng 

<  c(k)6'2‘'(62-6^)''^ 

<  0(62-6^  . 

But this is all we need, for substitution into (3.21), on noting that $\sigma = \tfrac12$ for this problem, immediately establishes the required (4.10).
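The "little elementary algebra" behind the second line of (4.14) amounts to solving the quadratic inequality $\delta_1^2 \le u(1-u) \le \delta_2^2$ for $u = |t|$. The sketch below checks on a grid that the two descriptions of the set agree; the values of $\delta_1$, $\delta_2$ and the grid are arbitrary illustrative choices.

```python
import math

def in_band(u, d1, d2):
    """First description in (4.14): d1^2 <= u (1 - u) <= d2^2."""
    return d1 ** 2 <= u * (1.0 - u) <= d2 ** 2

def in_intervals(u, d1, d2):
    """Second description in (4.14): u lies in one of two explicit intervals."""
    r1 = math.sqrt(0.25 - d1 ** 2)
    r2 = math.sqrt(0.25 - d2 ** 2)
    return (0.5 - r1 <= u <= 0.5 - r2) or (0.5 + r2 <= u <= 0.5 + r1)

d1, d2 = 0.31, 0.47                       # any 0 < d1 < d2 <= 1/2 would do
grid = [i / 1000.0 for i in range(1001)]  # candidate values of u = |t|
mismatches = [u for u in grid if in_band(u, d1, d2) != in_intervals(u, d1, d2)]
print(mismatches)                         # empty when the two descriptions agree
print(sum(in_band(u, d1, d2) for u in grid))
```

Both intervals have length $(\tfrac14 - \delta_1^2)^{1/2} - (\tfrac14 - \delta_2^2)^{1/2}$, which is the quantity that controls the lattice-point count in the rest of the argument.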


Example 4.3 Let $R_k$ be the set of all k-intervals of the form $[s,t] = \prod_{i=1}^k [s_i,t_i]$ contained in $[0,1]^k$. Then there exists a constant c such that

(4.15) $P\{\sup_{R_k} |\mathring{W}(A)| > \lambda\} \le c\lambda^{2(2k-1)}e^{-2\lambda^2}$.

Before we prove this result, we shall establish its sharpness by showing that there exists a c such that

(4.16) $c\lambda^{2(2k-1)}e^{-2\lambda^2} \le P\{\sup_{R_k} \mathring{W}(A) > \lambda\}$.

We  shall  prove  this  for  k=2.  For  k>2  the  proof  is  basically  the 
same,  the  notation  is  just  a  little  longer.  Let  A  =  [s,t]  be  a 
rectangle  in  [0,1]^  ,  and  define  a  mapping  T:R2  [0,1]^  by 


T([s,t]): 


2 


,V 


Clearly  we  must  have  0  s^.  _<  ^  1 ,  i  =1 ,2  for  [s,t]  to  be  in  R2  » 

and  so  it  is  easy  to  see  that  T  is  one-one  and  onto.  The  inverse 
mapping  is  defined  by 

(4.17)  T"^(z^  ,Z2.Z2,z^)  =  [(z^Cl-z^ ).z^(l-Z3)),  (z2,z^)]. 

Now  define  a  process  X(z)  on  [0,1]^  by  X(z)  =  W(T'^(z)). 

This  process  is  clearly  Gaussian  with  zero  mean,  and  it  follows  from 

(4.17)  and  (4.8)  that 

(4.18)  E[X^(z)]  =  *(?■'(;))  =  III  -  |2|^ 

This  is  the  variance  of  the  point  indexed  sheet  on  [0,1]^.  After  a 
page  or  so  of  elementary  algebra,  one  can  also  derive  the  rather  useful 
inequality  that  for  any  A,BeR2  > 

4 

x(AnB)  ^  n  rT.(A)  aT,(B)] 

i=r  ^  ^ 

where  T^(A)  is  the  i-th  coordinate  of  T(A).  An  immediate  consequence 
of  this  is  that 

4 

E[X(u)X(v)]  =  E[W(T'^u)W(T"\)]  i  n  u-aV.  -  lul-jy]. 

i=l 

That is, the covariance function of X is dominated by that of the point indexed sheet on $[0,1]^4$. Consequently, by (4.18) and Slepian's inequality (Slepian (1962)), the tail of $\sup X$ dominates that of the sheet. Theorem 2.2 of Adler and Brown (1985) states that this, in turn, dominates $c\lambda^6 e^{-2\lambda^2}$ for some c (or $c\lambda^{2(2k-1)}e^{-2\lambda^2}$ for general k), so that (4.16) is proven.

Now to the upper bound. We shall give the main steps of the derivation and skip all the algebra, most of which is similar to that in the previous example. To define $G_\theta$, set $m_\theta = [2k/\theta^2]$, and let
Gg  be  the  image  in  H  of  the  partition  of  given  by 
where  L|^(9)  is  the  set  of  all  integer  2k-tuples  of  the  form 

with  . i=l . k. 

=  0,1 , . . .  ,m.-l ,  i=l,...,k,  t=l,2  and  A(J)  is  the  collection  of 

It) 

all  k-intervals  [x,y]  satisfying  '"’el  ® 

|y^-j^  ^^Vnigi  <  0^/2k,  i  =  l,...,k.  It  is  easy  to  see  that  G^  is  a 
partition  of  the  required  form,  and  that 


N/(e) 


-  .k,  2k„-4k 
3.4  k  0 


=  ce 


-4k 


Consequently we have polynomial entropy with parameter $\kappa = 4k$. Continuing the same procedure, it is easy to see that, for each $A \in G_\theta$, $N_A(\theta\epsilon) \le c\epsilon^{-4k}$, so that as in the previous case we have $f(\theta) = \theta$ and $\theta_\lambda = c\lambda^{-1}$.

Now  consider  C^  n  C”  ,  which  we  can  write  as 

61  62 

k  k  k 

{B=  n  [x.  ,y.]:  <  n(y.-x.)  -  r  n  (y.-x.)]^  ^  . 

i^l  1  1  n  1  1  1  1  T 

Again  we  can  assume  6^  >  h,  and  follow  the  procedure  of  the  previous 
example  to  eventually  obtain 

n(6^562»®)  1  '^2''^1  ^ 

Substituting all the above into Theorem 3.5, together with the fact that $\sigma = \tfrac12$, we prove (4.15).

The previous two examples almost seem to indicate that in working with Brownian sheets it is only the dimensionality, d, of the parameter space that determines the power of $\lambda$ in our bound. For example, for $\mathring{W}$ on $[0,1]^k$, we have d=k and the bound is $c\lambda^{2(d-1)}e^{-2\lambda^2}$. For $\mathring{W}$ on $R_k$, we have d=2k (each $A \in R_k$ can be specified by 2k parameters) and the bound is again $c\lambda^{2(d-1)}e^{-2\lambda^2}$. This occurs again in treating $\mathring{W}$ indexed by all half-squares in $[0,1]^2$ (which we shall write as $\mathcal{P}_2 := \{A \subset [0,1]^2 : A = [0,1]^2 \cap \{(x,y) : \alpha x + \beta y + \gamma \ge 0\}$, some $\alpha,\beta,\gamma \in [-\infty,\infty]\}$), for which d=2.


Example 4.4 For the Brownian sheet indexed by half-squares, we have

(4.19) $P\{\sup_{\mathcal{P}_2} |\mathring{W}(A)| > \lambda\} \le c\lambda^2 e^{-2\lambda^2}$

for some finite, positive c.

To commence the proof of (4.19) note firstly that if $A \in \mathcal{P}_2$ then $\mathring{W}(A) = -\mathring{W}(A^c)$. Consequently we need only consider half of $\mathcal{P}_2$, say those half squares that contain at least one of the points (1,0) or (1,1). We write this as $\mathcal{P}_2^+$.

Let  S^,...,S^  denote  the  four  sides  of  the  unit  square, 

{(x,y):  0  ^  x,y  ^1}  on  which,  respectively,  x=0,  x=l ,  y=0,  y=l .  To 


■2n  (k), 


define  G^,  set  m^  =  [e  ]  and  x^'  '(e)  the  point  on  at  a 


distance  i/m„  from  its  start.  Now  let  A(0,k,«,,i , j)  be  the  collection 
u 


+  ( k  ^ 

of  all  half  planes  in  V  with  boundary  intersecting  S  between  x.  ' 

^  k  ^ 


and  xjljj,  and  between  and  x(jj  .  (k,;i=l , . . .  ,4,k?<;  ,i ,  j= 


0,1,..., m^).  These  A  clearly  provide  a  partition  of 


we 


take  the  induced  partition  in  the  space  of  W  as  G  .  Clearly  G 

6  0 


has  the  properties  we  generally  require  and,  furthermore 


(4.20) $N_C^G(\theta) = \binom{4}{2}(m_\theta+1)^2 \le 24\theta^{-4}$.

Consequently we have polynomial entropy with $\kappa = 4$. To further subdivide
these  sets,  simply  subdivide  each  interval  [x^.  more  finely, 

so  that  simple  calculations  yield  that  N^(ee)  i  for  each  such  A, 

Consequently  f(e)  =  e  and  for  p  ^2 

9^  =  A'\(l+2p"^)(l+8£np)’^]"^  . 

It remains to estimate $n(\delta_1,\delta_2,\theta)$, for which we must describe $C^+_{\delta_1} \cap C^-_{\delta_2}$. As before, this is made up of the image of all half squares whose intersections with $[0,1]^2$ have area S satisfying either

(4.21) $a_1 = \tfrac12 + (\tfrac14 - \delta_2^2)^{1/2} \le S \le \tfrac12 + (\tfrac14 - \delta_1^2)^{1/2} = b_1$

or

(4.22) $a_2 = \tfrac12 - (\tfrac14 - \delta_1^2)^{1/2} \le S \le \tfrac12 - (\tfrac14 - \delta_2^2)^{1/2} = b_2$.

We further divide $C^+_{\delta_1} \cap C^-_{\delta_2}$ into the image of half squares whose intersection with $[0,1]^2$ is a proper quadrilateral, and those that yield a triangle. We shall count only the first case; the second can be treated similarly, and yields the same order of magnitude bounds on $n(\delta_1,\delta_2,\theta)$. Clearly,
because  of  symmetry,  we  need  only  treat  quadrilaterals  including  all  of 
the  side  $2  ,  for  we  then  simply  add  a  factor  of  two  to  our  counting  to 
account  for  the  side  . 

Such quadrilaterals can be parametrized by two points u and v representing, respectively, the points of intersection of the boundary of the half plane with the sides $S_3$ and $S_4$ of $[0,1]^2$. Then the area of the quadrilateral is given by $1 - \tfrac12(u+v)$. For such a quadrilateral to be in the pre-image of $C^+_{\delta_1} \cap C^-_{\delta_2}$ it thus follows from (4.21) and (4.22) that

$2(1-b_i) \le u + v \le 2(1-a_i)$ for i=1 or 2.

Similarly,  if  the  coordinates  x(^^(0)  and  x.^‘^^(0)  on  S.,  and 
define  a  half  square  whose  image  lies  in  C.  a  ,  then 

6i  §2 


S 


4 


(4.23)  2(l-b.)mQ  <  i^+i2  <  2(l-a.)mQ 


for  i=l  or  2. 


For  fixed  the  number  of  pairs  (ii,i2)  satisfying  (4.23)  is 

no  more  than  2m2(b.-a.)  .  Now  note  that  via  a  little  algebra 
6  11 


16(b,-a,)  =  (l-4s/)‘‘  -  (1-4«2^)‘‘ic(62-6,)'= 


Using  this  and  all  the  above  we  find  that  for  small  enough  62-5-1. 

n(6^  ,62,0)  lc(62-6^)Sng2 

=  c(62-6^)^N^(0)  . 

Now apply Theorem 3.5 and the fact that $\sigma = \tfrac12$ to obtain (4.19) and so complete the proof.
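The area formula used in this proof can be confirmed with the shoelace formula. The sketch below assumes the quadrilateral has vertices (u,0) on the side y=0, (1,0), (1,1), and (v,1) on the side y=1, so that it contains the whole side x=1; the function names are illustrative.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def quadrilateral_area(u, v):
    """Area cut from the unit square by the line through (u, 0) and (v, 1),
    on the side containing the edge x = 1."""
    return shoelace_area([(u, 0.0), (1.0, 0.0), (1.0, 1.0), (v, 1.0)])

u, v = 0.25, 0.5
print(quadrilateral_area(u, v), 1.0 - 0.5 * (u + v))
```

Since the area S equals $1 - \tfrac12(u+v)$, the constraint $a_i \le S \le b_i$ is the same as $2(1-b_i) \le u+v \le 2(1-a_i)$, which is the condition that is then discretized in (4.23).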




5.  PROOFS  FOR  SECTION  3 

We  need  firstly  to  establish  (3.2),  i.e.  for  p  ^  2 
k 

and  all  A  >  (l+4KAnp) 

(5.1)  P{sup|Lxl  >  x(a+2p"^)}  <  f-ap^^e'^^^^du  . 

C  X 


Our  starting  point  is  the  basic  inequality  (2.9).  There,  put  ni=l ,  so 
that  6g=0»  5]=  o>  and  there  is  only  one  x  se<^uence  and  one  e  sequence 


-i/O 

Set  e.  =  p  and  x.  =  X2'''  .  Then  (2.9)  becomes 
w  J 


(5.2) 


P{sup|Lx|  >  X(a  +  z  2'^^^p"^  )}  ^a  z  p 


j=l 


j=0 


ij^(x2'^^^)  . 


The  sums  are  easy  to  calculate.  Following  Fernique  (1975),  for  j  .>  0 

p  i|)(X2'^^^)  =  vf/V  /  exp[(c2‘^''’^Jinp  +  i5j£n2  -  u^2'^”^]du 

X 

00  , 

_<  v^f/Tr'  f  exp[-i5u2  +  2<2,np  +  !5(j)in2  +  l-2'^)]du, 

X 

if  X  >  (l+4Kiinp)  .  Consequently  the  rightmost  sum  in  (5.2)  is  bounded  by 

ap%(x)  z  2’^/^exp  is(l-2’^). 
j=0 

Evaluating  the  sum  gives  the  upper  bound  in  (5.1)  with  a  little  room  to 

spare  for  the  constant.  The  leftmost  sum  in  (5.2)  is  easily  bounded  by 
_2 

2p  ,  and  so  (5.1)  is  established. 

We  can  now  start  proving  the  theorems  of  section  3. 


Proof  of  Theorem  3.1  We  commence  with  (5.1).  Note  firstly  from  the  proof 

-2  -k 

of  (5.1)  we  require  p  1  Cq»  ''•e-  P  >  Gq  .  We  have  also  required  Pi  2. 


Then  by  (5.1)  and  the  fact  p  ^2  we  have  that  for  x  >  (a+ii)2(l+4<£np)^ 

(5.3)  P{suplLxl  >  X}  <  |-ap^7  e'*^  ''^du 

C  X/(a+2p“2) 

<  I  ap^'"x'7a+2p"^)exp{-x2/2(a+2p“^)2}  , 

the  last  line  via  the  standard  inequality.  Now  set  p=x  in  (5.3), 
which  can  be  done  if  we  take  X  ^max(2,0Q'^)  and  x  >  (a+is)2(i+4Kii.nx) . 
Simple  algebra  converts  these  to  the  conditions  on  x  given  in  the  state¬ 
ment  of  the  theorem.  Then  on  substitution,  we  obtain 

(5.4)  P{suplLx|  >  X}  <  I  ax^'^’7a+Js)exp{-x2/2(a+2x‘^)2}  . 

C  ^ 

Under  the  conditions  we  have  on  x  ,  it  is  easy  to  check  that  the  exponent 

2  4 

here  is  bounded  above  by  x^/aa^  -  2(a+x  )/a  .  This  completes  the 
proof. 
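The "standard inequality" invoked for (5.3) is the familiar Gaussian tail bound $\psi(u) \le u^{-1}e^{-u^2/2}$. The sketch below checks it numerically under the assumption that $\psi$ denotes the standard normal upper tail probability (with the usual $(2\pi)^{-1/2}$ normalization); if $\psi$ is normalized differently in the text the constant changes accordingly.

```python
import math

def psi(u):
    """Standard normal upper tail: (2*pi)^(-1/2) * integral_u^inf exp(-x^2/2) dx."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def standard_bound(u):
    """Right-hand side of the inequality: exp(-u^2/2) / u, for u > 0."""
    return math.exp(-0.5 * u * u) / u

for u in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(u, psi(u), standard_bound(u))
```

For large u the two columns differ essentially by the factor $(2\pi)^{1/2}$, so nothing of importance is lost in the exponent by using the bound.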

Proof of Theorem 3.2 We shall not keep track of the constants of the Theorem throughout. Doing so more than doubles the length of, and complicates, an otherwise simple argument. The interested reader can check the constants by adding to the following argument some simple algebra.

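Fix $\delta \in (0,\sigma)$, choose $\lambda$ large, and note that we can always choose f so that $f(\delta) \le 1$. Then define

$\rho_1 = [\lambda^{-1} + \tfrac12(\sigma-\delta)]^{-1/2}$, $\rho_2 = \lambda f^{1/2}(\delta)$.

Both $\rho_1$ and $\rho_2$ are less than $\lambda$. Apply (5.3) to the two sets $C_1 := C \cap C_\delta^-$ and $C_2 := C \cap C_\delta^+$, using $\rho_1$ and $\rho_2$, respectively, in place of the $\rho$ there. We find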

(5.5)  P(sup|Li<|  >  X)  < 

<=1 

bounding  the  exponent  in  (5.3)  as  in  (5.4).  Furthermore 

(5.6)  P(suplLx|  >  x)  <_  c[xf^(x)]^''x'^e‘^^'^^‘^^  . 

<=2 

Combining  (5.5)  and  (5.6)  proves  the  theorem. 

Proof of Theorem 3.3 The idea of the proof is simple. If $\theta$ is small,
then so are the sets in $G$. For $A \in G$, choose some $x^* \in A$. For each
$x \in A$, write $Lx = Lx^* + L(x-x^*)$. Since $\|x-x^*\|$ must be small, $L(x-x^*)$
should also be small (stochastically). To show this we consider $L(x-x^*)$
conditional on $Lx^*$, using an idea used previously in Adler and Brown
(1985) and Berman (1984a) for certain Gaussian processes on Euclidean
space. Consequently, $Lx = Lx^*$ + a smaller order term. Precise estimates are
given in the theorem. The details of the proof are as follows.

Take $A \in G$ and let $x^*$ be a point in $A$ satisfying $\|x^*\| = \sup_A \|x\|$,
i.e. $x^*$ has maximal norm in $A$. (Such an $x^*$ exists, for we lose no
generality in assuming $A$ closed, and our assumption of finite entropy
then guarantees compactness and so the existence of $x^*$.) Consider the
process

    $L^*x := L(x-x^*) = Lx - Lx^*$,


and let $A^*$ be its image in $L^2(\Omega,P)$. Let $I$ be the (identity) operator
on $A^*$ that simply identifies each element of $A^*$ as a Gaussian variable.
The inner product $(u,v)$ of $u=L^*x$ and $v=L^*y$ in $A^*$ is given by
$E\{L^*x \cdot L^*y\}$, so $I$ is isonormal on $A^*$ and $\sup_{A^*}|Iu| = \sup_A|L^*x|$.
Furthermore, it is trivial to check that

    $\sup_{A^*}\|u\| \le \theta$ and $\|u-v\| = \|x-y\|$.

Thus the entropy function for $I$ is identical to that for $L$ on the
original space. Now recall the proof of (5.1). Rework it for $I$ on
$A^*$, noting condition (3.12), with $p$ replaced by $pf^{-1/2}(\theta)$. This gives

00  A 

(5.8)  P{sup|L*x|  >  x[0+2f(0)p'^3}  <  Iflip^*";  e"'^  du  . 

A  X 

Furthermore, precisely the same bound holds if we replace $L^*x$ by
$L^{**}x := Lx - E(Lx|Lx^*)$. This follows as for $L^*$, on noting that
$E(L^{**}x - L^{**}y)^2 \le \|u-v\|^2$, which follows from an easy calculation on
conditional variances.

Now note that the event that interests us, $\sup_A |Lx| > \lambda$, is included
in the union of the four events:

(5.9)    $|Lx^*| > \lambda - g(\theta)(1+4\kappa\ln p)^{1/2}$,

(5.10)   $\sup_A |L^*x| > \lambda$,

(5.11)   $\sup_A Lx > \lambda$ and $0 \le Lx^* \le \lambda - g(\theta)(1+4\kappa\ln p)^{1/2}$,

(5.12)   $\inf_A Lx < -\lambda$ and $-\lambda + g(\theta)(1+4\kappa\ln p)^{1/2} \le Lx^* \le 0$.
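The inclusion of the event of interest in the union of (5.9)-(5.12) is a purely deterministic statement about real numbers, so it can be checked mechanically. The sketch below assumes a finite set $A$, arbitrary real values standing in for $Lx$, and a hypothetical number `g` in $[0,\lambda]$ playing the role of $g(\theta)(1+4\kappa\ln p)^{1/2}$. None of these choices come from the paper.

```python
import random

def four_events(values, star, lam, g):
    """Evaluate (5.9)-(5.12) for values[i] = Lx_i and values[star] = Lx*.

    `g` is a hypothetical stand-in for g(theta)(1 + 4 kappa ln p)^(1/2).
    """
    lx_star = values[star]
    sup_lx, inf_lx = max(values), min(values)
    sup_abs_lstar = max(abs(v - lx_star) for v in values)  # sup |L*x|
    e59 = abs(lx_star) > lam - g
    e510 = sup_abs_lstar > lam
    e511 = sup_lx > lam and 0 <= lx_star <= lam - g
    e512 = inf_lx < -lam and -lam + g <= lx_star <= 0
    return e59, e510, e511, e512

def inclusion_holds(trials=2000, seed=0):
    """Search random configurations for a counterexample to the inclusion."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 6)
        values = [rng.gauss(0.0, 2.0) for _ in range(n)]
        star = rng.randrange(n)
        lam = rng.uniform(0.5, 4.0)
        g = rng.uniform(0.0, lam)
        exceeds = max(abs(v) for v in values) > lam
        if exceeds and not any(four_events(values, star, lam, g)):
            return False  # sup |Lx| > lam but no event occurred
    return True

if __name__ == "__main__":
    print(inclusion_holds())
```

The reason no counterexample can occur is the case split on the sign of $Lx^*$: if $0 \le Lx^* \le \lambda - g$ and some $Lx < -\lambda$, then $|L^*x| = Lx^* - Lx > \lambda$, which is (5.10), and symmetrically for $Lx^* \le 0$.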

The  probability  of  (5.9)  is  bounded  by  the  first  term  in  (3.13), 
while  the  second  term  there  bounds  the  probability  of  (5.10)  by  (5.8). 


The probabilities of (5.11) and (5.12), which are clearly identical, are
a  little  more  involved  to  derive. 

Note first that by well known properties of Gaussian variables

    $E(Lx|Lx^* = \eta) = \frac{(x,x^*)}{\|x^*\|^2}\,\eta \le \eta$,

if $\eta > 0$, since $x^*$ is a point of maximal norm. Consequently
$E(Lx|Lx^*) \le Lx^*$ on the set where $Lx^* \ge 0$, and so (5.11) is contained
in the event

    $\sup_A L^{**}x > \lambda - Lx^*$ and $0 \le Lx^* \le \lambda - g(\theta)(1+4\kappa\ln p)^{1/2}$.

But $L^{**}x$ and $Lx^*$ are independent, so the probability of this event
can be bounded by

    $\int_0^Y P(\sup_A L^{**}x > \lambda-u)\,\varphi(u/\sigma_A)\,\sigma_A^{-1}\,du$

with $Y = \lambda - g(\theta)(1+4\kappa\ln p)^{1/2}$. Applying (5.8) for $L^{**}$, we can bound this
by

Setting $z = \lambda(\lambda-u)$, this can be further bounded by

Noting  that  P(x+y)  £P(x)e  ^  for  all  x,y,  we  can  further  bound 
the  above  by 
This is now a standard integral, and turns out to be no more than half
the  last  factor  in  (3.13).  This  completes  the  proof  of  Theorem  3.3. 
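The three Gaussian facts used in this proof (the regression coefficient is at most one, the residual $L^{**}x$ is uncorrelated with $Lx^*$, and conditioning does not increase the metric) reduce to linear algebra once a concrete model is fixed. The sketch below assumes a finite-dimensional model in which $Lx = \langle x, Z\rangle$ for a standard normal vector $Z$, so that $E(Lx \cdot Ly) = \langle x,y\rangle$. This model and all names are illustrative assumptions, not the paper's general setting.

```python
import random

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def regression_coefficient(x, x_star):
    """Coefficient c with E(Lx | Lx*) = c * Lx* when E(Lx Ly) = <x, y>."""
    return dot(x, x_star) / dot(x_star, x_star)

def residual_vector(x, x_star):
    """Vector r with L**x = Lr, namely r = x - c x*."""
    c = regression_coefficient(x, x_star)
    return [s - c * t for s, t in zip(x, x_star)]

def check_properties(dim=4, n_points=8, seed=1, tol=1e-12):
    rng = random.Random(seed)
    pts = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_points)]
    x_star = max(pts, key=lambda p: dot(p, p))  # point of maximal norm
    for x in pts:
        # c <= 1, hence E(Lx | Lx* = eta) <= eta whenever eta > 0.
        if regression_coefficient(x, x_star) > 1 + tol:
            return False
        # Cov(L**x, Lx*) = <r, x*> = 0; jointly Gaussian, so independent.
        if abs(dot(residual_vector(x, x_star), x_star)) > tol:
            return False
        for y in pts:
            rx = residual_vector(x, x_star)
            ry = residual_vector(y, x_star)
            d_res = [s - t for s, t in zip(rx, ry)]
            d = [s - t for s, t in zip(x, y)]
            # E(L**x - L**y)^2 <= E(Lx - Ly)^2 = ||x - y||^2.
            if dot(d_res, d_res) > dot(d, d) + tol:
                return False
    return True

if __name__ == "__main__":
    print(check_properties())
```

The first check is the Cauchy-Schwarz inequality combined with $\|x\| \le \|x^*\|$; the last holds because the residual map is an orthogonal projection.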

Proof  of  Theorem  3.4.  Consider  Corollary  3.2  for  A's  belonging  to 
Gg  n  C^  and  ,  where  6c(0,a).  Noting  the  dependence  of 

$c_1$ in Corollary 3.2 on $\sigma$ by writing $c_1(\sigma)$, we find

P{suplLx|  >  x)  <.  n(6,e^){c^  (a)x  ^e  ^  +C2X~^exp[-i5X‘*(l+4Kj!,np)]} 

c 

+  [N®{0^)-n(6,0^)]-{c^  (6)x"^e‘^^^^'^^+C2x‘^exp[-J5X‘+(l+4K)inp)]} 

Along  with  the  other  restrictions  on  x,  now  take  x  >  [62(l+4icJ!,np)]”^. 
Then  applying  (3.16)  to  the  above  we  obtain,  changing  constants  at  will. 


(5.13)  P(suplLx|  >  X)  <  cN®(0j{(a-6) 


+  cng 


Choose  6  =  a-x"^iinx^®°  ,  taking  x  large  enough  so  that  6c{ho,a), 
and  note  that  for  this  6 


and 


5-x2/2«2  ,  ^  (1  +  sh£.)i 


62 
=  exp{  -—[!+(—)

2a2  62 

<  exp{  -  —  -  ^  ;^nx} 

2a2  ‘5 

<  r2ee-^"/2a2  ^ 

Substituting  these  last  two  inequalities  into  (5.13)  establishes  (3.17) 
and,  thus,  the  theorem. 

Proof of Theorem 3.5 We work from Corollary 3.3. For fixed $\lambda$ define
the sequence $\{\delta_i\}$ given by

^0  “  ^i^  “  . 

where $m := [\tfrac12\sigma^2\lambda^2]$. Clearly it will suffice for us to bound
P(suplLxl  >  x).  Apply  Corollary  3.3  to  obtain 

<0 

P{SUp|Lx|  >  A}  <  C  I  n(6.  ,,6.,6)x  exp(-*5X2/6,2). 

c*  *='  ' 

Note that $\delta_i-\delta_{i-1} \le 1/(\sigma\lambda^2)$. Take $\lambda$ large enough for (3.19) to
hold, and substitute to bound the above sum by

(5.14)  c[o  ^  '^^N  (0.)  +  n^  X-'].  I  exp(-isx2/6.2) 

^  ®x  i=l  ' 

Thus to complete the proof we need only bound the last summation by
$e^{-\frac12\lambda^2/\sigma^2}$ as follows. Set

    $a_i = \exp\{-\tfrac12\lambda^2/(\sigma^2-(m-i)\lambda^{-2})\}$.


It  is  easy  to  check  that 


Thus  the  sum  in  (5.14)  is  bounded  by 


,.-l/2a\k  “m 
k=l  “l-e‘^^° 


which  completes  the  proof. 
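The geometric-series step can be sanity checked numerically. The sketch below rests on our reading of the summands, namely $a_i = \exp\{-\tfrac12\lambda^2/(\sigma^2-(m-i)\lambda^{-2})\}$ for $i=1,\dots,m$ with $m = [\tfrac12\sigma^2\lambda^2]$, and on the ratio bound $a_{i-1} \le e^{-1/2\sigma^4}a_i$ that this reading implies. The symbol $a_i$ and the function names are assumptions made for illustration.

```python
import math

def summands(lam, sigma):
    """a_i = exp(-lam^2 / (2 delta_i^2)), delta_i^2 = sigma^2 - (m - i)/lam^2."""
    m = math.floor(0.5 * sigma**2 * lam**2)
    return [math.exp(-0.5 * lam**2 / (sigma**2 - (m - i) / lam**2))
            for i in range(1, m + 1)]

def geometric_bound(lam, sigma):
    """a_m / (1 - exp(-1/(2 sigma^4))), with a_m = exp(-lam^2/(2 sigma^2))."""
    a_m = math.exp(-0.5 * lam**2 / sigma**2)
    return a_m / (1.0 - math.exp(-0.5 / sigma**4))

def bound_holds(lam, sigma, slack=1e-12):
    """Check the term ratios and the resulting bound on the whole sum."""
    terms = summands(lam, sigma)
    ratio = math.exp(-0.5 / sigma**4)
    ratios_ok = all(terms[i - 1] <= ratio * terms[i] * (1 + slack)
                    for i in range(1, len(terms)))
    sum_ok = sum(terms) <= geometric_bound(lam, sigma) * (1 + slack)
    return ratios_ok and sum_ok

if __name__ == "__main__":
    print(all(bound_holds(lam, s) for lam in (3.0, 5.0, 10.0)
              for s in (0.7, 1.0, 2.0)))
```

Under this reading the ratio bound comes from $1/\delta_{i-1}^2 - 1/\delta_i^2 = \lambda^{-2}/(\delta_{i-1}^2\delta_i^2) \ge \lambda^{-2}/\sigma^4$, so the whole sum is at most a constant (depending only on $\sigma$) times $e^{-\lambda^2/2\sigma^2}$.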


Remark:  The  astute  reader  may  have  noticed  that  at  no  point  in  any  of  our 
proofs  have  we  used  the  full  power  of  the  Basic  Inequality  (2.9),  in  that 
we  have  not  taken  advantage  of  the  e  and  \  double  sequences  to  partition  C 
according  to  variance  (i.e.  into  the  sets  of  Gg).  The  reason  is  that, 
while  doing  so  we  can  improve  on  the  standard  upper  bounds,  we  cannot 
reach the sharpness of, say, Theorem 3.5 without an intermediate result like
Theorem 3.3. In fact, it is the careful conditioning argument that goes
into the proof of Theorem 3.3 that, ultimately, makes everything work.


6. SOME COMMENTS


1.  Lower  bounds.  Throughout  this  paper  we  have,  with  the  exception  of 
Example  4.3  treating  rectangle  indexed  sheets,  dealt  only  with  upper  bounds 
for  the  excursion  probability.  The  fact  that  in  every  example  for  which 
lower  bounds  are  available  we  find  that  our  upper  bounds  are  sharp  in  the 
power of $\lambda$ leads one to believe that they may be sharp in general.

This, however, does not seem to be easy to prove. Some lower bounds are
available from Weber (1980) and these, like his upper bounds, are sharp for
processes with constant variance. For the highly nonstationary examples of
Section 4 they do not provide bounds that match our upper bounds. As for upper
bounds, however, it is easy to see by example that lower bounds that depend only
on entropy without taking into consideration varying variance can never be sharp
for all cases.

2. Vapnik-Červonenkis classes. The natural geometric structure of VC classes
of sets or functions should be enough to generate some of the homogeneity
of $C$ required by our theorems. Furthermore, the fact that each VC class
has a natural, single parameter describing its structure (and, in a
certain sense, its "dimensionality") seems to indicate that it should be
possible to apply our results to VC classes in such a way that this
parameter enters in a simple fashion into the power of $\lambda$. We have found
indications that this should be true, but have been unable, so far, to put
together a serious proof.

3. Exponential entropy. The exponent of entropy of $C$ is defined by

    $r = r(C) = \limsup_{\varepsilon \downarrow 0} \log\log N(C,\varepsilon)/\log(1/\varepsilon)$.

For $L$ to be continuous on $C$ we must have $r \le 2$. If $r < 2$, $L$ is
continuous. For $r=2$ there are examples of both continuous and discontinuous
(and hence unbounded) processes. By assuming, as we have since Section 3,
polynomial entropy, we assume $r=0$, thus leaving out many interesting
examples. In particular, we cannot handle many set indexed sheet
problems. (See Dudley (1973, 78) for examples.) Furthermore, bounds given by
a power of $\lambda$ times a Gaussian exponential factor are not valid in this
case. Nevertheless, a result of
Borell (1975, p. 214, middle of proof) states that for all a.s. bounded
Gaussian processes there is a bound of the form $\exp(-\tfrac12\lambda^2 + \text{const.}\,\lambda)$. In
fact, Borell's result can be improved on, and, under mild conditions it
is possible to show that there is a function $a: [0,2] \to [0,1]$ for which
bounds of the form $\exp(-\tfrac12\lambda^2 + \text{const.}\,\lambda^{a(r)})$ hold. We shall report this
separately.
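The remark that polynomial entropy forces $r=0$ can be illustrated with a short computation. The sketch below evaluates $\log\log N(\varepsilon)/\log(1/\varepsilon)$ for two hypothetical entropy functions, $N(\varepsilon) = K\varepsilon^{-\alpha}$ and $N(\varepsilon) = \exp(\varepsilon^{-r})$. The constants $K$, $\alpha$, $r$ and the function names are chosen only for illustration.

```python
import math

def entropy_exponent_estimate(log_n, eps):
    """log log N(eps) / log(1/eps), taking log N(eps) as input to avoid overflow."""
    return math.log(log_n) / math.log(1.0 / eps)

def polynomial_log_entropy(eps, K=10.0, alpha=3.0):
    """log N(eps) for the hypothetical polynomial entropy N(eps) = K * eps^-alpha."""
    return math.log(K) + alpha * math.log(1.0 / eps)

def exponential_log_entropy(eps, r=1.5):
    """log N(eps) for the hypothetical entropy N(eps) = exp(eps^-r)."""
    return eps ** (-r)

if __name__ == "__main__":
    for eps in (1e-3, 1e-30, 1e-300):
        poly = entropy_exponent_estimate(polynomial_log_entropy(eps), eps)
        expo = entropy_exponent_estimate(exponential_log_entropy(eps, 0.5), eps)
        print(eps, poly, expo)
```

For the polynomial case the ratio decays to zero, though only at a logarithmic rate, while for $N(\varepsilon) = \exp(\varepsilon^{-r})$ it equals $r$ at every $\varepsilon$.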